Slides: Random Distributed Scalar Encoder

---

# Random Distributed Scalar Encoder
Chetan Surpur, Numenta
.footnote[As used in [NuPIC](https://github.com/numenta/nupic)]

---

# Agenda

1. Introduction

2. Review: Basic Scalar Encoder

3. Why Random Distributed Scalar Encoder?

4. Objectives

5. Algorithm w/ Example

6. Parameters

7. Intuitions

8. Results

9. Q&A

---

# Basic Scalar Encoder

![Example](/images/slides/numenta-rdse/basic-scalar-encoder.png)

---

# Why RDSE?

- Basic Scalar Encoder saturates at the edges of its range: inputs at or beyond min / max all get the same edge representation

- Have to decide on min / max up front

- Can't change the range of the encoder at run-time, or learning will be lost
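
A minimal sketch of the saturation problem, assuming a toy encoder that clips its input and activates a contiguous block of `w` bits out of `n`. The function `basic_encode` and its defaults are hypothetical and are not NuPIC's API:

```python
def basic_encode(value, minval, maxval, n=20, w=5):
    """Toy basic scalar encoder: a contiguous run of w active bits out of n."""
    # Inputs outside [minval, maxval] are clipped to the nearest edge.
    clipped = min(max(value, minval), maxval)
    n_buckets = n - w + 1
    # Linearly map the clipped value onto the index of the first active bit.
    start = int(round((clipped - minval) / (maxval - minval) * (n_buckets - 1)))
    return [1 if start <= i < start + w else 0 for i in range(n)]


at_max = basic_encode(100, minval=0, maxval=100)
beyond_max = basic_encode(250, minval=0, maxval=100)
```

Here `at_max` and `beyond_max` are identical, so the encoder cannot distinguish 100 from 250 once the range is fixed.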

---

# Objectives of a Scalar Encoder

- Represent each bucket with a number of bits

- Adjacent buckets should share many bits

- Non-adjacent buckets shouldn't share any bits

---

# Algorithm

- For the first bucket, select a number of bits randomly

- Remember which bits were assigned to which bucket

- When moving along the scalar number line, and a new bucket is needed:

    - Look up which bits were assigned to the previous bucket (let's call it P)

    - Make a new set of bits to represent the new bucket that has all but one of the bits in P and one randomly* selected bit not in P

- When assigning a scalar to an existing bucket, just look up which bits were assigned to the target bucket
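
The steps above can be sketched roughly as follows. This is an illustrative toy, not NuPIC's implementation: the class name `SketchRDSE`, the default values, the choice to drop the oldest bit of P, and using the first input as the origin of bucket 0 are all assumptions. Any extra constraint on the random pick hinted at by the asterisk is not modelled.

```python
import math
import random


class SketchRDSE:
    """Toy random distributed scalar encoder (hypothetical, for illustration)."""

    def __init__(self, n=400, w=21, r=1.0, seed=42):
        self.n, self.w, self.r = n, w, r
        self.rng = random.Random(seed)
        self.offset = None   # scalar mapped to bucket 0 (assumed: first input)
        self.buckets = {}    # bucket index -> ordered list of active bits
        self.lo = self.hi = 0

    def _new_bit(self, taken):
        # Pick a random bit that is not already in the neighbouring bucket P.
        while True:
            bit = self.rng.randrange(self.n)
            if bit not in taken:
                return bit

    def _bits_for(self, index):
        if not self.buckets:
            # First bucket: w bits chosen at random.
            self.buckets[0] = self.rng.sample(range(self.n), self.w)
        while index > self.hi:
            # New bucket above: keep all but one bit of P, add one new bit.
            prev = self.buckets[self.hi]
            self.hi += 1
            self.buckets[self.hi] = prev[1:] + [self._new_bit(prev)]
        while index < self.lo:
            # New bucket below: same idea, growing in the other direction.
            prev = self.buckets[self.lo]
            self.lo -= 1
            self.buckets[self.lo] = [self._new_bit(prev)] + prev[:-1]
        # Existing bucket: just look up the remembered bits.
        return self.buckets[index]

    def encode(self, value):
        if self.offset is None:
            self.offset = value
        index = math.floor((value - self.offset) / self.r)
        return frozenset(self._bits_for(index))


enc = SketchRDSE(n=400, w=21, r=1.0, seed=1)
a, b, c = enc.encode(10.0), enc.encode(11.0), enc.encode(9.0)
```

In this sketch `a` shares `w - 1` bits with each of its neighbours `b` and `c`, and encoding 10.0 again later returns the same bits, however many new buckets were created in between.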

---

# RDSE Example

![Example](/images/slides/numenta-rdse/random-distributed-scalar-encoder.png)

---

# Parameters

- 'n' is total number of available bits

- 'w' is number of bits representing each bucket (width)

- 'r' is how "wide" a bucket is (resolution)
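
To make 'r' concrete, here is a small sketch of how a scalar might be mapped to a bucket index. The helper `bucket_index` and its `offset` parameter are hypothetical, and flooring is an assumed convention; a real encoder may round or choose its origin differently:

```python
import math


def bucket_index(value, resolution, offset=0.0):
    """Map a scalar to a bucket index; each bucket spans `resolution` units."""
    return math.floor((value - offset) / resolution)


# With r = 0.5, 10.2 and 10.6 land in different buckets...
fine = (bucket_index(10.2, 0.5), bucket_index(10.6, 0.5))
# ...while with r = 2.0 they are grouped into the same bucket.
coarse = (bucket_index(10.2, 2.0), bucket_index(10.6, 2.0))
```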

---

# Intuitions

- Each bit contributes to representing multiple adjacent buckets

- As you create more and more buckets, the chance of selecting a bit for a new bucket that already represents an existing non-adjacent bucket increases

- Eventually, once the generated buckets saturate the available bits, it'll be hard for the Spatial Pooler (SP) to tell which scalar range (bucket) a particular bit is representing

- But this happens smoothly, and learning is never lost, since representations never change; they only become blurrier

---

# Intuitions (cont.)

- The bigger the 'n', the less likely bits will be used for many non-adjacent buckets

- The bigger the 'w', the more similarity will be attributed to adjacent buckets

- The bigger the 'r', the more scalars will be grouped together into buckets
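
A rough way to see the effect of 'n': if two far-apart buckets are treated as independent random sets of `w` bits out of `n` (a simplifying assumption, since real buckets are built incrementally), their expected accidental overlap is `w * w / n`. The function names below are hypothetical:

```python
import random


def expected_overlap(n, w):
    """Mean number of shared bits between two independent random w-of-n sets."""
    return w * w / n


def simulated_overlap(n, w, trials=2000, seed=0):
    """Estimate the same quantity by sampling random pairs of bit sets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = set(rng.sample(range(n), w))
        b = set(rng.sample(range(n), w))
        total += len(a & b)
    return total / trials


small_n = expected_overlap(100, 21)   # fewer available bits, more collisions
large_n = expected_overlap(400, 21)   # more available bits, fewer collisions
estimate = simulated_overlap(400, 21)
```

Quadrupling `n` cuts the expected accidental overlap by a factor of four in this model, which matches the intuition that a bigger 'n' keeps non-adjacent buckets distinct.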

---

# Results

- RDSE is better than Basic Scalar Encoder, at least for our datasets (Reference: Subutai)

- Easier to configure and use

- No reason to use Basic Scalar Encoder instead of RDSE anymore (other than as a tool for explanation)

- Caveat: RDSE is not an adaptive encoder that learns the statistics of the input and allocates buckets accordingly

- Those kinds of modifications are still open for experimentation

---

# Thank you.
Questions?