deepword package¶
Subpackages¶
- deepword.agents package
- deepword.models package
  - Submodules
    - deepword.models.dqn_modeling module
    - deepword.models.drrn_modeling module
    - deepword.models.dsqn_modeling module
    - deepword.models.gen_modeling module
    - deepword.models.models module
    - deepword.models.nlu_modeling module
    - deepword.models.snn_modeling module
    - deepword.models.transformer module
    - deepword.models.utils module
  - Module contents
- deepword.students package
- deepword.tests package
- deepword.tools package
  - Submodules
    - deepword.tools.build_test_set module
    - deepword.tools.clean_hs2tj module
    - deepword.tools.collect_game_elements module
    - deepword.tools.compare_eval_results module
    - deepword.tools.diff_train_w_test module
    - deepword.tools.play_game module
    - deepword.tools.read_eval_failed_reason module
    - deepword.tools.read_eval_results module
    - deepword.tools.replay_from_log module
  - Module contents
Submodules¶
deepword.action module¶
class deepword.action.ActionCollector(tokenizer: deepword.tokenizers.Tokenizer, n_tokens: int, unk_val_id: int, padding_val_id: int)¶
- Bases: deepword.log.Logging
- Collect actions for different games.
__init__(tokenizer: deepword.tokenizers.Tokenizer, n_tokens: int, unk_val_id: int, padding_val_id: int) → None¶
- Parameters
- tokenizer – see deepword.tokenizers
- n_tokens – max allowed number of tokens for all actions
- unk_val_id – ID of the unknown token
- padding_val_id – ID of the padding token

property action2idx¶
- Map from current actions to token IDs.

property action_len¶
- Current action lengths.

property action_matrix¶
- Current action matrix.

property actions¶
- Current actions as strings.

add_new_episode(gid: str) → None¶
- Add a new episode with a game ID.
- Parameters
- gid – game ID, a string that separates different games.

extend(actions: List[str]) → numpy.ndarray¶
- Extend actions into the ActionCollector.
- Parameters
- actions – a list of actions for the current episode of game-playing.

get_action_len(gid: Optional[str] = None) → numpy.ndarray¶
- Get action lengths for a game.
- Parameters
- gid – game ID; None falls back to the current episode's game.
- Returns
- an array of ints, each element being the length of the corresponding action

get_action_matrix(gid: Optional[str] = None) → numpy.ndarray¶
- Get the action matrix for a game; see the sketch at the end of this class.
- Parameters
- gid – the game ID; None falls back to the currently active game.
- Returns
- an array of actions, where each action is a vector of token IDs, padded at the end to a common length

get_actions(gid: Optional[str] = None) → List[str]¶
- Get all actions for a game.
- Parameters
- gid – game ID; None falls back to the current episode's game.
- Returns
- a list of actions as strings

get_game_ids() → List[str]¶
- Get all game IDs in this ActionCollector. 
 - 
load_actions(path: str) → None¶
- Load all actions into this ActionCollector.
- Parameters
- path – a path to an npz file.
 
 - 
save_actions(path: str) → None¶
- Save all actions to a path as an npz file.
- Parameters
- path – an npz path to save to.
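
A minimal, self-contained sketch of the action matrix described above, with a toy whitespace tokenizer standing in for deepword.tokenizers.Tokenizer (the vocabulary and helper below are illustrative, not deepword's API):

    import numpy as np

    # Toy vocabulary; a real collector uses a Tokenizer from deepword.tokenizers.
    vocab = {"[PAD]": 0, "[UNK]": 1, "go": 2, "east": 3, "open": 4, "door": 5}

    def to_ids(action, n_tokens, unk_val_id=1, padding_val_id=0):
        ids = [vocab.get(t, unk_val_id) for t in action.split()][:n_tokens]
        return ids + [padding_val_id] * (n_tokens - len(ids))

    actions = ["go east", "open door", "look"]
    action_matrix = np.asarray([to_ids(a, n_tokens=4) for a in actions])
    action_len = np.asarray([min(len(a.split()), 4) for a in actions])
    print(action_matrix)  # one row of token IDs per action, padded to n_tokens
    print(action_len)     # [2 2 1]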
 
 
deepword.dependency_parser module¶
- 
class deepword.dependency_parser.DependencyParserReorder(padding_val: str, stride_len: int)¶
- Bases: deepword.log.Logging
- Use a dependency parser to reorder master sentences. Make sure to start a Stanford CoreNLP server first; refer to https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
- The DP reorder class is used with CNN layers for trajectory encoding; refer to https://arxiv.org/abs/1905.02265 for details.
__init__(padding_val: str, stride_len: int) → None¶
- Parameters
- padding_val – padding token, e.g. ‘[PAD]’ or ‘O’ 
- stride_len – CNN stride len 
 
 
 - 
reorder(master: str) → str¶
- Use dependency parser to reorder a paragraph. 
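
A minimal usage sketch, assuming the package is installed and a Stanford CoreNLP server is already running (the input text is illustrative):

    from deepword.dependency_parser import DependencyParserReorder

    # Requires a running Stanford CoreNLP server; see the link above.
    dp = DependencyParserReorder(padding_val="[PAD]", stride_len=3)
    reordered = dp.reorder("You are in the kitchen. There is a door to the east.")
    print(reordered)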
 
deepword.eval_games module¶
- 
class deepword.eval_games.EvalResult(score, positive_score, negative_score, max_score, steps, won, action_list)¶
- 
class deepword.eval_games.FullDirEvalPlayer¶
- Bases: deepword.log.Logging
classmethod start(hp, model_dir, game_files, n_gpus, range_min=None, range_max=None)¶
 
class deepword.eval_games.LoopDogEvalPlayer¶
- Bases: deepword.log.Logging
start(hp, model_dir, game_files, n_gpus)¶
 
- 
- 
class deepword.eval_games.MultiGPUsEvalPlayer(hp, model_dir, game_files, n_gpus, load_best=True)¶
- Bases: deepword.log.Logging
- Eval player that runs on multiple GPUs.
evaluate(restore_from: str, debug: bool = False) → None¶
- Evaluate an agent.
- Parameters
- restore_from – path to restore weights from
- debug – if True, multi-threading is disabled
 
 
 - 
has_better_model(total_scores: float, total_steps: float) → bool¶
- Whether the current model is better than the current best.
- Parameters
- total_scores – total scores earned
- total_steps – total steps used
 
 
 - 
save_best_model(loaded_ckpt_step: int) → None¶
- Copy the current model to the best-model dir.
- Parameters
- loaded_ckpt_step – which model (checkpoint step) to copy
 
 - 
classmethod split_game_files(game_files: List[str], k: int, rnd_seed: int = 42) → List[List[str]]¶
- Split game files into k portions for multi-GPU playing; a sketch follows this class.
- Parameters
- game_files – a list of games for playing
- k – number of splits
- rnd_seed – random seed
 
- Returns
- a list of list of game files 
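
A minimal sketch of such a split (the round-robin dealing after a seeded shuffle is an assumption, not necessarily deepword's exact scheme):

    import random

    def split_game_files(game_files, k, rnd_seed=42):
        # Shuffle deterministically, then deal files round-robin into k buckets.
        files = sorted(game_files)
        random.Random(rnd_seed).shuffle(files)
        return [files[i::k] for i in range(k)]

    print(split_game_files(["g1.ulx", "g2.ulx", "g3.ulx", "g4.ulx"], k=2))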
 
 
- 
- 
class deepword.eval_games.NewModelHandler(hp, model_dir, game_files, n_gpus)¶
- Bases: watchdog.events.FileSystemEventHandler
is_ckpt_file(src_path)¶
 - 
on_created(event)¶
- Called when a file or directory is created.
- Parameters
- event (DirCreatedEvent or FileCreatedEvent) – Event representing file/directory creation.
 
 - 
on_modified(event)¶
- Called when a file or directory is modified.
- Parameters
- event (DirModifiedEvent or FileModifiedEvent) – Event representing file/directory modification.
 
 - 
run_eval_player(restore_from=None, load_best=False)¶
 
- 
- 
class deepword.eval_games.WatchDogEvalPlayer¶
- Bases: deepword.log.Logging
start(hp, model_dir, game_files, n_gpus)¶
 
- 
- 
deepword.eval_games.agent_collect_data(agent, game_files, max_episode_steps, epoch_size, epoch_limit)¶
- 
deepword.eval_games.agg_eval_results(eval_results: Dict[str, List[deepword.eval_games.EvalResult]], max_steps_per_episode: int = 100) → Tuple[Dict[str, deepword.eval_games.EvalResult], float, float, float, float, float, float]¶
- Aggregate evaluation results. We run N test games, each with M episodes, and each episode has a maximum of K steps. A simplified sketch follows.
- Parameters
- eval_results – evaluation results of text-based games, in the format dict(game_name, [eval_result1, …, eval_resultM]); the number of eval_results is the same for all games. Each eval_result contains score, positive_score, negative_score, max_score, steps, won (bool), and used_action_list.
- max_steps_per_episode – i.e. K, default = 100
- Returns
- agg_per_game – dict(game_name, (sum scores, sum max scores, sum steps, # won))
- sample_mean – total earned scores / total maximum scores
- confidence_interval – confidence interval of sample_mean over the M episodes
- steps – total used steps / total maximum steps
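
A simplified, self-contained sketch of this aggregation (the normal-approximation confidence interval and the reduced return tuple are illustrative; the real function returns more statistics):

    from collections import namedtuple
    import numpy as np

    EvalResult = namedtuple(
        "EvalResult",
        "score positive_score negative_score max_score steps won action_list")

    def agg_eval_results(eval_results, max_steps_per_episode=100):
        agg_per_game = {}
        ratios = []  # per-episode earned/max score ratios
        total_scores = total_max = total_steps = 0.0
        for game, results in eval_results.items():
            scores = sum(r.score for r in results)
            max_scores = sum(r.max_score for r in results)
            steps = sum(r.steps for r in results)
            n_won = sum(1 for r in results if r.won)
            agg_per_game[game] = (scores, max_scores, steps, n_won)
            ratios.extend(r.score / r.max_score for r in results)
            total_scores += scores
            total_max += max_scores
            total_steps += steps
        sample_mean = total_scores / total_max
        # half-width of a normal-approximation CI over per-episode ratios
        ci = 1.96 * np.std(ratios) / np.sqrt(len(ratios))
        n_episodes = sum(len(v) for v in eval_results.values())
        steps_ratio = total_steps / (max_steps_per_episode * n_episodes)
        return agg_per_game, sample_mean, ci, steps_ratio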
 
- 
deepword.eval_games.eval_agent(hp: tensorflow.contrib.training.python.training.hparam.HParams, model_dir: str, load_best: bool, restore_from: str, game_files: List[str], gpu_device: Optional[str] = None) → Tuple[Dict[str, List[deepword.eval_games.EvalResult]], int]¶
- Evaluate an agent with given games. For each game, we run nb_episodes episodes, with max_episode_steps per episode.
- Note that evaluation differs from training: in training, we register all given games with the TextWorld structure and play them in random order; for evaluation, we register one game at a time and play it for nb_episodes.
- Parameters
- hp – hyperparameter to create the agent 
- model_dir – model dir of the agent 
- load_best – bool; load from best_weights if True, otherwise from last_weights
- restore_from – string, load from a specific model, e.g. {model_dir}/last_weights/after_epoch-0 
- game_files – game files for evaluation 
- gpu_device – which GPU device to load, in a format of “/device:GPU:i” 
 
- Returns
- eval_results, loaded_ckpt_step 
 
- 
deepword.eval_games.scores_of_tiers(agg_per_game: Dict[str, deepword.eval_games.EvalResult]) → Dict[str, float]¶
- Compute scores per tier given aggregated scores per game.
- Parameters
- agg_per_game – aggregated results per game
- Returns
- a map of tier-name -> score, from tier1 to tier6
 
deepword.floor_plan module¶
- 
class deepword.floor_plan.FloorPlanCollector¶
- Bases: deepword.log.Logging
- Collect floor plans from games. E.g. if going east from the kitchen leads to the bedroom, then we know kitchen – east –> bedroom and bedroom – west –> kitchen.
 - 
add_new_episode(eid)¶
 - 
extend(fps)¶
 - 
get_map(room)¶
 - 
init()¶
 - 
load_fps(path)¶
 - 
route_to_kitchen(room)¶
 - 
classmethod route_to_room(ss, tt, fp, visited)¶
- Find the fastest route from a given room to a target room using DFS; see the sketch at the end of this class.
- Parameters
- ss – start room
- tt – target room
- fp – floor plan
- visited – initialized by []
- Returns
- directions, rooms
 - 
save_fps(path)¶
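
A self-contained sketch of the DFS route search described in route_to_room, over a toy floor plan (the nested-dict layout of fp is an assumption):

    def route_to_room(ss, tt, fp, visited):
        """DFS from room ss to room tt over floor plan fp.

        fp maps a room to {direction: neighboring room}.
        Returns (directions, rooms), or None if tt is unreachable.
        """
        if ss == tt:
            return [], []
        visited.append(ss)
        for direction, room in fp.get(ss, {}).items():
            if room in visited:
                continue
            sub = route_to_room(room, tt, fp, visited)
            if sub is not None:
                return [direction] + sub[0], [room] + sub[1]
        return None

    fp = {"kitchen": {"east": "bedroom"},
          "bedroom": {"west": "kitchen", "north": "bathroom"}}
    print(route_to_room("kitchen", "bathroom", fp, []))
    # (['east', 'north'], ['bedroom', 'bathroom'])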
 
deepword.hparams module¶
- 
class deepword.hparams.Conventions(logo_file, bert_ckpt_dir, bert_vocab_file, nltk_vocab_file, glove_vocab_file, glove_emb_file, legacy_zork_vocab_file, albert_ckpt_dir, albert_vocab_file, albert_spm_path, bert_cls_token, bert_unk_token, bert_padding_token, bert_sep_token, bert_mask_token, bert_sos_token, bert_eos_token, albert_cls_token, albert_unk_token, albert_padding_token, albert_sep_token, albert_mask_token, nltk_unk_token, nltk_padding_token, nltk_sos_token, nltk_eos_token)¶
- Bases: deepword.hparams.Conventions
- 
deepword.hparams.copy_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams) → tensorflow.contrib.training.python.training.hparam.HParams¶
- Deepcopy for hp 
- 
deepword.hparams.get_model_hparams(model_creator: str) → tensorflow.contrib.training.python.training.hparam.HParams¶
- 
deepword.hparams.has_valid_val(dict_args: Optional[Dict[str, Any]], key: str) → bool¶
- if dict_args exists 
- if key in dict_args 
- if dict_args[key] is not None 
 
- 
deepword.hparams.load_hparams(fn_model_config: Optional[str] = None, cmd_args: Optional[Dict[str, Any]] = None, fn_pre_config: Optional[str] = None) → tensorflow.contrib.training.python.training.hparam.HParams¶
- Load hyper-parameters with the following priority: priority(file_args) > priority(cmd_args), except for args in allowed_to_change; priority(cmd_args) > priority(pre_config); priority(pre_config) > priority(default). A sketch of this layering follows.
- Parameters
- fn_model_config – hyperparameter config file in model_dir 
- cmd_args – command line arguments 
- fn_pre_config – pre config file for model 
 
- Returns
- hp 
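
A minimal sketch of this layered override logic using plain dicts (names are illustrative; the real function operates on TF HParams objects):

    def load_layered(default, pre_config=None, cmd_args=None,
                     file_args=None, allowed_to_change=()):
        hp = dict(default)            # lowest priority
        hp.update(pre_config or {})   # pre_config > default
        hp.update(cmd_args or {})     # cmd_args > pre_config
        # file_args > cmd_args, except keys the cmd line is allowed to change
        hp.update({k: v for k, v in (file_args or {}).items()
                   if k not in allowed_to_change})
        return hp

    print(load_layered({"lr": 1e-3, "dim": 64},
                       cmd_args={"lr": 1e-4, "dim": 32},
                       file_args={"dim": 128},
                       allowed_to_change=("lr",)))
    # {'lr': 0.0001, 'dim': 128}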
 
- 
deepword.hparams.output_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams) → str¶
- Pretty-print hp as a table-style string
- 
deepword.hparams.save_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams, file_path: str) → None¶
- Save hyperparameters to a json file 
- 
deepword.hparams.update_hparams_from_dict(hp: tensorflow.contrib.training.python.training.hparam.HParams, cmd_args: Dict[str, Any], allowed_to_change: Optional[Iterable[str]] = None) → tensorflow.contrib.training.python.training.hparam.HParams¶
- Update hp from a dict.
- Parameters
- hp – hyperparameters
- cmd_args – command line arguments
- allowed_to_change – keys that are allowed to be updated
- Returns
- an hp
 
- 
deepword.hparams.update_hparams_from_file(hp: tensorflow.contrib.training.python.training.hparam.HParams, file_args: str) → tensorflow.contrib.training.python.training.hparam.HParams¶
- update hp from a json file 
- 
deepword.hparams.update_hparams_from_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams, hp2: tensorflow.contrib.training.python.training.hparam.HParams) → tensorflow.contrib.training.python.training.hparam.HParams¶
- Update hp from hp2; hp should not share any keys with hp2.

deepword.log module¶
- 
class deepword.log.Logging(name: Optional[str] = None)¶
- Bases: object
- Logging utils for classes; a minimal sketch appears after the method list.
__init__(name: Optional[str] = None)¶
- Parameters
- name – name for logging, default module_name.class_name 
 
 - 
debug(msg, *args, **kwargs)¶
 - 
error(msg, *args, **kwargs)¶
 - 
info(msg, *args, **kwargs)¶
 - 
warning(msg, *args, **kwargs)¶
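
A minimal sketch of such a logging mixin (assumed behavior, inferred from the documented default name):

    import logging

    class Logging(object):
        """Mixin giving a class its own named logger."""

        def __init__(self, name=None):
            if name is None:  # default: module_name.class_name
                name = "{}.{}".format(self.__class__.__module__,
                                      self.__class__.__name__)
            self._logger = logging.getLogger(name)

        def debug(self, msg, *args, **kwargs):
            self._logger.debug(msg, *args, **kwargs)

        def info(self, msg, *args, **kwargs):
            self._logger.info(msg, *args, **kwargs)

        def warning(self, msg, *args, **kwargs):
            self._logger.warning(msg, *args, **kwargs)

        def error(self, msg, *args, **kwargs):
            self._logger.error(msg, *args, **kwargs)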
 
deepword.main module¶
- 
deepword.main.eval_one_ckpt(hp, model_dir, data_path, learner_clazz, device, ckpt_path)¶
- 
deepword.main.get_parser() → argparse.ArgumentParser¶
- Get arg parser for different modules 
- 
deepword.main.hp_parser() → argparse.ArgumentParser¶
- Arg parser for hyper-parameters 
- 
deepword.main.main(args)¶
- 
deepword.main.process_eval_dqn(args)¶
- Evaluate dqn models 
- 
deepword.main.process_eval_student(args)¶
- Evaluate student models 
- 
deepword.main.process_gen_data(args)¶
- Generate training data from a teacher model 
- 
deepword.main.process_hp(args) → tensorflow.contrib.training.python.training.hparam.HParams¶
- Load hyperparameters from three locations: 1. the config file in model_dir; 2. pre-config files; 3. cmd line args.
- Parameters
- args – cmd line args 
- Returns
- hyperparameters 
 
- 
deepword.main.process_snn_input(args)¶
- Generate SNN input
- 
deepword.main.process_train_dqn(args)¶
- Train DQN models 
- 
deepword.main.process_train_student(args)¶
- Train student models 
- 
deepword.main.run_agent(agent: deepword.agents.base_agent.BaseAgent, game_env: gym.core.Env, nb_games: int, nb_epochs: int) → None¶
- Run a training agent on given games.
- Parameters
- agent – an agent extending the base agent; see deepword.agents.base_agent.BaseAgent
- game_env – game Env, from gym
- nb_games – number of games
- nb_epochs – number of epochs for training
 
 
- 
deepword.main.run_agent_v2(agent: deepword.agents.base_agent.BaseAgent, game_env: gym.core.Env, nb_games: int, nb_epochs: int) → None¶
- Run a training agent on given games. Proactively request "look" and "inventory" results from games to substitute for the description and inventory parts of infos. This is useful when games don't provide description and inventory, e.g. Z-machine games.
- NB: this incurs extra steps during game playing; remember to use 3 times the previous step quota. E.g. if you previously used 100 max steps, you now need 100 * 3 max steps.
- 
deepword.main.train(hp: tensorflow.contrib.training.python.training.hparam.HParams, model_dir: str, game_dir: str, f_games: Optional[str] = None, func_run_agent: Callable[[deepword.agents.base_agent.BaseAgent, gym.core.Env, int, int], None] = <function run_agent>) → None¶
- Train an agent.
- Parameters
- hp – hyper-parameters; see deepword.hparams
- model_dir – model dir
- game_dir – game dir with ulx games
- f_games – game names to select from game_dir
- func_run_agent – how to run the agent and games; see deepword.main.run_agent()
 
 
- 
deepword.main.train_v2(hp, model_dir, game_dir, f_games=None)¶
- Train DQN agents by proactively requesting description and inventory; max steps per episode will be enlarged 3 times.
deepword.stats module¶
- 
class deepword.stats.UCBComputer(d_states: int, d_actions: int)¶
- Bases: object
- Compute Upper Confidence Bound actions during game playing, at inference time only, when hidden states are fixed.
- We use the Abbasi-Yadkori, Pal, and Szepesvari (APS) bound for LinUCB.
- Cite: Improved Algorithms for Linear Stochastic Bandits (Abbasi-Yadkori, Pal, and Szepesvari, 2011)
- See also: Learn What Not to Learn (Tom Zahavy et al., 2019)
__init__(d_states: int, d_actions: int)¶
- V: covariance matrix for each action a
- lam: lambda, controlling parameter size in ridge regression
- r: R-sub-Gaussian
- s: bound for |theta_a|_2
- delta: with probability 1 - delta, the bound holds
- Parameters
- d_states – dimension of hidden states 
- d_actions – number of actions 
 
 
 - 
aps_bound(q_actions: numpy.ndarray, h_state: numpy.ndarray) → float¶
- Compute the APS bound.
- Parameters
- q_actions – Q-vector of actions 
- h_state – hidden state of a game state 
 
 - Returns
- upper confidence bound of q_actions. 
 
 - 
collect_sample(action_idx: int, h_state: numpy.ndarray) → None¶
- Collect state-action pairs.
- Parameters
- action_idx – action index 
- h_state – hidden state vector 
 
 
 - 
reset() → None¶
- Reset to accept new episodes 
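
A sketch of a LinUCB-style bound in this family (the constants follow the self-normalized bound of Abbasi-Yadkori et al. with unit feature norm; deepword's exact computation, defaults, and return shape may differ):

    import numpy as np

    class LinUCBSketch(object):
        def __init__(self, d_states, d_actions,
                     lam=1.0, r=1.0, s=1.0, delta=0.1):
            self.lam, self.r, self.s, self.delta = lam, r, s, delta
            self.d = d_states
            # one ridge-regression covariance matrix V per action
            self.V = [lam * np.eye(d_states) for _ in range(d_actions)]
            self.t = 0

        def collect_sample(self, action_idx, h_state):
            self.V[action_idx] += np.outer(h_state, h_state)
            self.t += 1

        def aps_bound(self, q_actions, h_state):
            # beta_t = r * sqrt(d * log((1 + t/(lam*d)) / delta)) + sqrt(lam) * s
            beta = (self.r * np.sqrt(self.d * np.log(
                (1 + self.t / (self.lam * self.d)) / self.delta))
                + np.sqrt(self.lam) * self.s)
            widths = np.array([np.sqrt(h_state @ np.linalg.inv(V) @ h_state)
                               for V in self.V])
            # returns per-action upper bounds here; the documented API reduces
            # this to a single float
            return np.asarray(q_actions, dtype=float) + beta * widths

        def reset(self):
            self.V = [self.lam * np.eye(self.d) for _ in range(len(self.V))]
            self.t = 0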
 
- 
deepword.stats.mean_confidence_interval(data: numpy.ndarray, confidence: float = 0.95) → Tuple[float, float]¶
- Given 1D np array data, compute the mean and confidence interval at the given confidence level.
- Parameters
- data – 1D np array 
- confidence – confidence level 
 
- Returns
- mean and confidence interval 
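
A standard implementation of this statistic using a Student-t interval (a sketch; deepword's exact estimator is not shown here):

    import numpy as np
    from scipy import stats

    def mean_confidence_interval(data, confidence=0.95):
        data = np.asarray(data, dtype=np.float64)
        n = len(data)
        mean = np.mean(data)
        sem = stats.sem(data)  # standard error of the mean
        # half-width of the two-sided t-interval with n - 1 degrees of freedom
        h = sem * stats.t.ppf((1 + confidence) / 2.0, n - 1)
        return mean, h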
 
deepword.sum_tree module¶
This SumTree code is a modified version of Morvan Zhou's: https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/master/contents/5.2_Prioritized_Replay_DQN/RL_brain.py
- 
class deepword.sum_tree.SumTree(*args, **kwds)¶
- Bases: typing.Generic
- The SumTree is a binary tree whose leaf nodes contain the real data; a compact sketch follows the method list.
add(priority: float, data: E)¶
- Add an experience into the tree with a priority.
- Parameters
- priority – priority of sampling 
- data – experience of the replay 
 
- Returns
- old data at the same position, 0 if unset 
 
 - 
get_leaf(v: float) → Tuple[int, float, E]¶
- Get a leaf_index w.r.t. a priority value; the selected leaf_index must have the smallest priority among all leaves with priority values larger than v.
- Parameters
- v – a priority value 
- Returns
- leaf index, priority, and experience associated with the leaf index 
 
 - 
property total_priority¶
- The total priority is the value on the root node. 
 - 
update(tree_index: int, priority: float) → None¶
- Update the leaf priority score and propagate the change through the tree.
- Parameters
- tree_index – tree index of the current data_pointer 
- priority – priority sampling value 
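
A compact, self-contained sketch of such a SumTree (array-based, with leaves stored after the internal nodes; deepword's version is additionally generic over the stored element type):

    import numpy as np

    class SumTree:
        def __init__(self, capacity):
            self.capacity = capacity
            self.tree = np.zeros(2 * capacity - 1)  # internal nodes + leaves
            self.data = [None] * capacity
            self.ptr = 0  # next leaf slot to write

        def add(self, priority, data):
            tree_index = self.ptr + self.capacity - 1
            old = self.data[self.ptr]
            self.data[self.ptr] = data
            self.update(tree_index, priority)
            self.ptr = (self.ptr + 1) % self.capacity  # overwrite the oldest
            return old

        def update(self, tree_index, priority):
            change = priority - self.tree[tree_index]
            self.tree[tree_index] = priority
            while tree_index != 0:  # propagate the change up to the root
                tree_index = (tree_index - 1) // 2
                self.tree[tree_index] += change

        def get_leaf(self, v):
            i = 0
            while 2 * i + 1 < len(self.tree):  # descend until a leaf
                left = 2 * i + 1
                if v <= self.tree[left]:
                    i = left
                else:
                    v -= self.tree[left]
                    i = left + 1
            return i, self.tree[i], self.data[i - self.capacity + 1]

        @property
        def total_priority(self):
            return self.tree[0]  # the root holds the sum of all priorities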
 
 
 
deepword.tokenizers module¶
- 
class deepword.tokenizers.AlbertTokenizer(vocab_file, do_lower_case, spm_model_file)¶
- Bases: deepword.tokenizers.BertTokenizer
- The tokenizer from ALBERT.
de_tokenize(ids)¶
- Turn a list of ids into a string.
- Parameters
- ids – ids of tokens 
- Returns
- a string 
 
 
- 
- 
class deepword.tokenizers.BertTokenizer(vocab_file, do_lower_case)¶
- Bases: deepword.tokenizers.Tokenizer
- The tokenizer from BERT.
convert_ids_to_tokens(ids)¶
- Convert ids to tokens.
- Parameters
- ids – a list of ids 
- Returns
- a list of tokens 
 
 - 
convert_tokens_to_ids(tokens)¶
- Convert tokens into ids.
- Parameters
- tokens – a list of tokens 
- Returns
- a list of ids 
 
 - 
de_tokenize(ids)¶
- Turn a list of ids into a string.
- Parameters
- ids – ids of tokens 
- Returns
- a string 
 
 - 
property inv_vocab¶
- Inverse of the vocabulary.
- Returns
- map from positions (ids) to tokens 
 
 - 
tokenize(text)¶
- Tokenize a text into a list of tokens.
- Parameters
- text – a string to tokenize 
- Returns
- a list of tokens 
 
 - 
property vocab¶
- Get the vocabulary.
- Returns
- map from tokens to positions (ids) 
 
 
- 
- 
class deepword.tokenizers.LegacyZorkTokenizer(vocab_file)¶
- Bases: deepword.tokenizers.NLTKTokenizer
- The NLTK tokenizer, but keeps only alphabetic strings by removing all tokens with non-alphabetic characters.
tokenize(text)¶
- Tokenize a text into a list of tokens.
- Parameters
- text – a string to tokenize 
- Returns
- a list of tokens 
 
 
- 
- 
class deepword.tokenizers.NLTKTokenizer(vocab_file, do_lower_case)¶
- Bases: deepword.tokenizers.Tokenizer
- Wrapper of the tokenizer from the NLTK package.
convert_ids_to_tokens(ids)¶
- Convert ids to tokens.
- Parameters
- ids – a list of ids 
- Returns
- a list of tokens 
 
 - 
convert_tokens_to_ids(tokens)¶
- Convert tokens into ids.
- Parameters
- tokens – a list of tokens 
- Returns
- a list of ids 
 
 - 
de_tokenize(ids: List[int]) → str¶
- Turn a list of ids into a string.
- Parameters
- ids – ids of tokens 
- Returns
- a string 
 
 - 
property inv_vocab¶
- Inverse of the vocabulary.
- Returns
- map from positions (ids) to tokens 
 
 - 
tokenize(text)¶
- Tokenize a text into a list of tokens.
- Parameters
- text – a string to tokenize 
- Returns
- a list of tokens 
 
 - 
property vocab¶
- Get the vocabulary.
- Returns
- map from tokens to positions (ids) 
 
 
- 
- 
class deepword.tokenizers.Tokenizer¶
- Bases: object
- A wrapper of a tokenizer; a toy implementation of this interface follows the method list.
convert_ids_to_tokens(ids: List[int]) → List[str]¶
- Convert ids to tokens.
- Parameters
- ids – a list of ids 
- Returns
- a list of tokens 
 
 - 
convert_tokens_to_ids(tokens: List[str]) → List[int]¶
- Convert tokens into ids.
- Parameters
- tokens – a list of tokens 
- Returns
- a list of ids 
 
 - 
de_tokenize(ids: List[int]) → str¶
- Turn a list of ids into a string.
- Parameters
- ids – ids of tokens 
- Returns
- a string 
 
 - 
property inv_vocab¶
- Inverse of the vocabulary.
- Returns
- map from positions (ids) to tokens 
 
 - 
tokenize(text: str) → List[str]¶
- Tokenize a text into a list of tokens.
- Parameters
- text – a string to tokenize 
- Returns
- a list of tokens 
 
 - 
property vocab¶
- Get the vocabulary.
- Returns
- map from tokens to positions (ids) 
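
A toy implementation of this interface, for illustration only (whitespace tokenization over a fixed vocabulary; the real subclasses wrap BERT, ALBERT, or NLTK):

    class WhitespaceTokenizer(object):
        """Toy Tokenizer: whitespace splitting over a fixed vocabulary."""

        def __init__(self, tokens):
            self._vocab = {t: i for i, t in enumerate(tokens)}
            self._inv_vocab = {i: t for t, i in self._vocab.items()}

        @property
        def vocab(self):
            return self._vocab

        @property
        def inv_vocab(self):
            return self._inv_vocab

        def tokenize(self, text):
            return text.lower().split()

        def convert_tokens_to_ids(self, tokens):
            return [self._vocab[t] for t in tokens]

        def convert_ids_to_tokens(self, ids):
            return [self._inv_vocab[i] for i in ids]

        def de_tokenize(self, ids):
            return " ".join(self.convert_ids_to_tokens(ids))

    tok = WhitespaceTokenizer(["go", "east", "open", "door"])
    ids = tok.convert_tokens_to_ids(tok.tokenize("go east"))
    print(ids, tok.de_tokenize(ids))  # [0, 1] go east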
 
 
- 
- 
deepword.tokenizers.get_albert_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams) → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
- 
deepword.tokenizers.get_bert_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams) → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
- 
deepword.tokenizers.get_nltk_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams, vocab_file: str = '/Users/xusenyin/git-store/deepword/python/deepword/../../resources/vocab.txt') → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
- 
deepword.tokenizers.get_zork_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams, vocab_file: str = '/Users/xusenyin/local/opt/legacy-zork-vocab.txt') → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
- 
deepword.tokenizers.init_tokens(hp: tensorflow.contrib.training.python.training.hparam.HParams) → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
- Initialize a tokenizer given hyperparameters.
- Parameters
- hp – hyperparameters, see deepword.hparams
- Returns
- updated hp, tokenizer 
 
deepword.trajectory module¶
- 
class deepword.trajectory.Trajectory(*args, **kwds)¶
- Bases: typing.Generic, deepword.log.Logging
- BaseTrajectory only takes care of interacting with the Agent to collect game scripts. Fetching data from a Trajectory to feed into an Encoder should be implemented in extended classes.
__init__(num_turns: int, size_per_turn: int = 1)¶
- Take the ActionMaster (AM) as an example: given Trajectory(AM1, AM2, AM3, AM4, AM5) with last_sid pointing to AM5, num_turns = 1 means we choose [AM5].
- size_per_turn only controls how we separate pre- and post-trajectory; with the default size_per_turn = 1, AM4 is the pre-trajectory of AM5.
- Sometimes we need to change it, e.g. with legacy data where we store the trajectory as Trajectory(M1, A1, M2, A2, M3, A3, M4) and the last sid points to M4; then the pre-trajectory of [A3, M4] is [A2, M3], which is why size_per_turn should be set to 2. See the sketch after this parameter list.
- Parameters
- num_turns – how many turns to choose other than current turn 
- size_per_turn – how many cells count as one turn 
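
A tiny list-slicing sketch of these semantics (the indexing formula is illustrative, not deepword's implementation):

    # Trajectory cells and the state id (sid) of the last cell.
    tj = ["M1", "A1", "M2", "A2", "M3", "A3", "M4"]
    last_sid = len(tj) - 1

    def state(tj, sid, num_turns, size_per_turn):
        # a state spans (num_turns + 1) turns ending at sid, where one turn
        # spans size_per_turn cells; here num_turns counts the extra turns
        # before the current one
        start = sid + 1 - (num_turns + 1) * size_per_turn
        return tj[max(0, start):sid + 1]

    # size_per_turn=2: the last state is [A3, M4]; its pre-state is [A2, M3]
    print(state(tj, last_sid, num_turns=0, size_per_turn=2))      # ['A3', 'M4']
    print(state(tj, last_sid - 2, num_turns=0, size_per_turn=2))  # ['A2', 'M3']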
 
 
 - 
add_new_tj(tid: Optional[int] = None) → int¶
- Add a new trajectory.
- Parameters
- tid – trajectory id, None falls back to auto-generated id. 
- Returns
- a tid 
 
 - 
append(content: T) → None¶
- Uses a generic type for content: the trajectory class doesn't care what is stored and doesn't process it.
- Parameters
- content – something to add in the current trajectory 
 
 - 
fetch_batch_pre_states(b_tid: List[int], b_sid: List[int]) → List[List[T]]¶
- Fetch a batch of pre-states given trajectory ids and state ids; the position of pre-states depends on size_per_turn.
- Parameters
- b_tid – a batch of trajectory ids 
- b_sid – a batch of state ids 
 
- Returns
- a list of lists of contents 
 
 - 
fetch_batch_states(b_tid: List[int], b_sid: List[int]) → List[List[T]]¶
- Fetch a batch of states given trajectory ids and state ids.
- Parameters
- b_tid – a batch of trajectory ids 
- b_sid – a batch of state ids 
 
- Returns
- a list of lists of contents 
 
 - 
fetch_last_state() → List[T]¶
- Fetch the last state from the current trajectory.
- Returns
- a list of contents 
 
 - 
fetch_state_by_idx(tid: int, sid: int) → List[T]¶
- Fetch a state given trajectory id and state id.
- Returns
- a list of contents 
 
 - 
get_current_tid() → int¶
- Get current trajectory id 
 - 
get_last_sid() → int¶
- A state is defined as a series of interactions between a game and an agent ended with the game’s last response. e.g. “G0, A1, G2, A3, G4” is a state ended with the game’s last response named “G4”. 
 - 
load_tjs(path: str) → None¶
- Load trajectories from an npz file
 - 
request_delete_keys(ks: List[int]) → Dict[int, List[T]]¶
- Request to delete all trajectories with keys in ks.
- Parameters
- ks – a list of keys of trajectories to be deleted 
 
 - 
save_tjs(path: str) → None¶
- Save all trajectories in an npz file.
- All trajectory ids, all trajectories, the current trajectory id, and the current trajectory will be saved.
 
deepword.tree_memory module¶
This SumTree code is a modified version; the original code is from: https://github.com/jaara/AI-blog/blob/master/Seaquest-DDQN-PER.py
- 
class deepword.tree_memory.TreeMemory(*args, **kwds)¶
- Bases: deepword.log.Logging, typing.Generic
- TreeMemory stores and samples experiences for replay.
append(experience: E) → E¶
- New experiences get the max priority over all leaves, to make sure they are sampled.
- Parameters
- experience – a new experience 
- Returns
- previous experience in the same position 
 
 - 
batch_update(tree_idx: numpy.ndarray, abs_errors: numpy.ndarray) → None¶
- Update the priorities on the tree.
- Parameters
- tree_idx – an array of index (int) 
- abs_errors – an array of abs errors (float) 
 
 
 - 
load_memo(path: str) → None¶
- Loading a memo only applies to a new memo without any appending; loading will change the previous tree structure.
- Parameters
- path – an npz file to load
 
 - 
sample_batch(n: int) → Tuple[numpy.ndarray, List[E], numpy.ndarray]¶
- Sample a batch of experiences according to priority values; a sketch follows this class.
- First, to sample a batch of size n, the range [0, total_priority] is divided into n equal sub-ranges.
- Then a value is uniformly sampled from each sub-range.
- We search the SumTree and retrieve the experience whose priority score corresponds to each sampled value.
- Then we calculate importance sampling (IS) weights for each element in the batch.
 - Parameters
- n – batch size 
- Returns
- tree index, experiences, IS weights 
 
 - 
save_memo(path: str) → None¶
- Save the memory to an npz file.
- Parameters
- path – path to an npz file
 
 - 
uniform_sample_batch(n: int) → numpy.ndarray¶
- Randomly sample a batch of experiences.
- Parameters
- n – batch size 
- Returns
- a batch of experiences 
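
A sketch of the stratified prioritized sampling with IS weights described in sample_batch, built on a SumTree like the one sketched under deepword.sum_tree (beta, the IS exponent, and the use of tree.capacity are illustrative):

    import numpy as np

    def sample_batch(tree, n, beta=0.4):
        """Stratified sampling from a SumTree, with importance-sampling weights."""
        segment = tree.total_priority / n  # split [0, total_priority] into n ranges
        idxes, batch, priorities = [], [], []
        for i in range(n):
            v = np.random.uniform(segment * i, segment * (i + 1))
            leaf_idx, priority, data = tree.get_leaf(v)
            idxes.append(leaf_idx)
            priorities.append(priority)
            batch.append(data)
        probs = np.asarray(priorities) / tree.total_priority
        # IS weights correct the bias of prioritized sampling; normalized by max.
        weights = (tree.capacity * probs) ** (-beta)
        weights /= weights.max()
        return np.asarray(idxes), batch, weights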
 
 
deepword.utils module¶
- 
deepword.utils.agent_name2clazz(name: str)¶
- Find the class given the agent name in this package.
- Parameters
- name – agent name from deepword.agents
- Returns
- the class w.r.t. the agent name 
 
- 
deepword.utils.bytes2idx(byte_mask: List[bytes], size: int) → numpy.ndarray¶
- Load a list of bytes that choose 1 for selected actions.
- Parameters
- byte_mask – a list of bytes 
- size – the size of total actions 
 
- Returns
- an np array of indices 
 
- 
deepword.utils.core_name2clazz(name: str)¶
- Find the class given the core name in this package.
- Parameters
- name – core name from deepword.agents.cores
- Returns
- the class w.r.t. the core name 
 
- 
deepword.utils.ctime() → int¶
- Current time in milliseconds
- 
deepword.utils.eprint(*args, **kwargs)¶
- print to stderr 
- 
deepword.utils.flatmap(f, items)¶
- flatmap for python 
- 
deepword.utils.flatten(items)¶
- flatten a list of lists to a list 
- 
deepword.utils.get_hash(txt: str) → str¶
- get hex hash value for a string 
- 
deepword.utils.get_token2idx(tokens: List[str]) → Dict[str, int]¶
- From a list of tokens to a dict of token to position 
- 
deepword.utils.learner_name2clazz(name: str)¶
- Find the class given the learner name in this package.
- Parameters
- name – learner name from deepword.students
- Returns
- the class w.r.t. the learner name 
 
- 
deepword.utils.load_actions(action_file: str) → List[str]¶
- Load unique actions from an action file 
- 
deepword.utils.load_and_split(game_path: str, f_games: str) → Tuple[List[str], List[str]]¶
- Load games and split into train and dev sets.
- Parameters
- game_path – game dir
- f_games – a file with a list of games, one game name per line, without the ulx suffix
 
- Returns
- train_games, dev_games 
 
- 
deepword.utils.load_game_files(game_path: str, f_games: Optional[str] = None) → List[str]¶
- Load a dir of games, or a single game. If game_path is a file, return a list containing that file; if game_path is a dir, return a list of files in the dir suffixed with .ulx; if f_games is set, load the files in game_path whose names are listed in f_games.
 - Parameters
- game_path – a dir, or a single file 
- f_games – a file of game names, without suffix, default suffix .ulx 
 
- Returns
- a list of game files 
 
- 
deepword.utils.load_uniq_lines(fname: str) → List[str]¶
- Load unique lines from a file, line order preserved 
- 
deepword.utils.load_vocab(vocab_file: str) → List[str]¶
- Load unique words from a vocabulary 
- 
deepword.utils.model_name2clazz(name: str)¶
- Find the class given the model name in this package.
- Parameters
- name – model name from deepword.models
- Returns
- the class w.r.t. the model name 
 
- 
deepword.utils.report_status(lst_of_status: List[Tuple[str, Any]]) → str¶
- Pretty-print a series of k-v pairs.
- Parameters
- lst_of_status – A list of k-v pairs 
- Returns
- a string to print 
 
- 
deepword.utils.setup_eval_log(log_filename: str)¶
- Set up the log for evaluation.
- Parameters
- log_filename – the path to log file 
 
- 
deepword.utils.setup_logging(default_path: str = 'logging.yaml', default_level: int = 20, env_key: str = 'LOG_CFG', local_log_filename: Optional[str] = None) → None¶
- Set up logging for a Python project.
- Load the YAML config file from default_path, or from the path given by the environment variable env_key; fall back to the default config if the file does not exist.
- If local_log_filename is set, add a local rotating log file.
- 
deepword.utils.setup_train_log(model_dir: str)¶
- Set up the log for training by putting a game_script.log in model_dir.
- 
deepword.utils.softmax(x: numpy.ndarray) → numpy.ndarray¶
- Numerically stable softmax; see the sketch below
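
The standard max-subtraction trick (a sketch):

    import numpy as np

    def softmax(x):
        # subtracting the max makes exp() overflow-safe; the result is unchanged
        e = np.exp(x - np.max(x))
        return e / np.sum(e)

    print(softmax(np.array([1000.0, 1001.0])))  # [0.26894142 0.73105858]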
- 
deepword.utils.split_train_dev(game_files: List[str], train_ratio: float = 0.9, rnd_seed: int = 42) → Tuple[List[str], List[str]]¶
- Split train/dev sets from the given game files: sort, shuffle with Random(rnd_seed), then split; see the sketch below.
- Parameters
- game_files – game files 
- train_ratio – the percentage of training files 
- rnd_seed – for randomly shuffle files, default = 42 
 
- Returns
- train_games, dev_games 
 - Exception:
- empty game_files 
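
A minimal sketch of the sort-shuffle-split procedure (the rounding of the split point is an assumption):

    import random

    def split_train_dev(game_files, train_ratio=0.9, rnd_seed=42):
        if not game_files:
            raise ValueError("empty game_files")
        files = sorted(game_files)                 # sort
        random.Random(rnd_seed).shuffle(files)     # deterministic shuffle
        n_train = int(len(files) * train_ratio)    # split
        return files[:n_train], files[n_train:]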
 
- 
deepword.utils.uniq(lst)¶
- order-preserving unique 
