Algorithm described in paper/blog and implemented in outlines.
For a sequence of tokens $s_{1:t} = (s_1, \dots, s_t)$, predict the next token $s_{t+1}$
by multinomial sampling from the categorical distribution obtained by applying softmax to the output logits.
The idea of guided generation is to restrict the support of this distribution to a subset of the vocabulary, forbidding any token that would make the sequence stop respecting the required structure (e.g. a regular expression).
naive approach
- apply a boolean mask that restricts the support of the original distribution
- renormalise output logits
- perform multinomial sampling
Cost: deciding whether each possible token keeps the sequence valid has to be done over the whole vocabulary at every step.
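A minimal sketch of the naive approach in NumPy (the boolean mask `allowed` is assumed to come from whatever structure check is run over the vocabulary each step; the function name and signature are illustrative, not the outlines API):

```python
import numpy as np

def sample_with_mask(logits: np.ndarray, allowed: np.ndarray, rng: np.random.Generator) -> int:
    """Naive guided sampling: mask disallowed tokens, renormalise, sample.

    logits  -- raw output logits over the vocabulary, shape (vocab_size,)
    allowed -- boolean mask, True where the token keeps the sequence valid
    """
    # Apply the boolean mask: disallowed tokens get zero probability.
    masked = np.where(allowed, logits.astype(float), -np.inf)
    # Renormalise via softmax over the remaining support.
    masked -= masked.max()          # numerical stability
    probs = np.exp(masked)
    probs /= probs.sum()
    # Multinomial sampling from the restricted distribution.
    return int(rng.choice(len(probs), p=probs))
```

The expensive part is not the sampling itself but producing `allowed`, which in the naive approach means checking every token in the vocabulary against the structure at every step.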
efficient approach (finite state machine)
Use a finite state machine, the tuple $(Q, \Sigma, \delta, q_0, F)$, to efficiently continue matching from the current state instead of re-reading the growing sample sequence from the beginning each time:
- $Q$, finite set of states
- $\Sigma$, finite alphabet
- $\delta: Q \times \Sigma \to Q$, transition function
- $q_0 \in Q$, start state
- $F \subseteq Q$, set of accept states
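As a concrete (hand-built, purely illustrative) example, the tuple for a small DFA recognising `-?[0-9]+` could be written out directly in Python; outlines derives such a machine from the regular expression automatically:

```python
# Toy DFA for the regex -?[0-9]+ written out as the tuple (Q, Sigma, delta, q0, F).
DIGITS = set("0123456789")

Q = {"start", "minus", "int"}                  # finite set of states
SIGMA = DIGITS | {"-"}                         # finite alphabet
DELTA = {                                      # transition function delta(q, a) -> q'
    ("start", "-"): "minus",
    **{("start", d): "int" for d in DIGITS},
    **{("minus", d): "int" for d in DIGITS},
    **{("int", d): "int" for d in DIGITS},
}
Q0 = "start"                                   # start state
F = {"int"}                                    # accept states

def step(state: str, char: str) -> str | None:
    """Advance the DFA by one character; None means the string is rejected."""
    return DELTA.get((state, char))
```

Continuing generation then only requires keeping the current state and calling `step` on newly sampled characters, rather than re-matching the regex against the whole sequence.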
Need to pre-process the vocabulary using the regular expression's FSM and build an index (a hash map from FSM state to the set of allowed tokens, so constructing the mask is roughly constant-time per step) recording the FSM sub-sequences that accept each string in the vocabulary:
- consider starting in every viable FSM state
- determine the valid next tokens by checking that the transition function produces a viable state for every character of the token (see the sketch after this list).
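A sketch of the index construction, using the same toy dict-based FSM representation as above (the function name and the `(token_id, token_string)` vocabulary format are assumptions, not the outlines data structures):

```python
from collections import defaultdict

def build_token_index(vocabulary, delta, states):
    """Pre-compute, for every FSM state, the set of tokens that keep the FSM viable.

    vocabulary -- iterable of (token_id, token_string) pairs
    delta      -- transition function as a dict {(state, char): next_state}
    states     -- all FSM states to consider as starting points
    """
    index = defaultdict(set)          # state -> set of allowed token ids
    for token_id, token_str in vocabulary:
        # Consider starting in every viable FSM state.
        for start in states:
            state = start
            for char in token_str:
                state = delta.get((state, char))
                if state is None:     # transition fails -> token not allowed from here
                    break
            else:
                # Every character produced a viable state: token is valid from `start`.
                index[start].add(token_id)
    return index
```

At generation time the mask for the current FSM state is then a single dictionary lookup, `index[state]`, instead of a scan over the whole vocabulary; the per-token work has been moved into a one-off pre-processing pass.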