deepword package¶
Subpackages¶
- deepword.agents package
- deepword.models package
- Submodules
- deepword.models.dqn_modeling module
- deepword.models.drrn_modeling module
- deepword.models.dsqn_modeling module
- deepword.models.gen_modeling module
- deepword.models.models module
- deepword.models.nlu_modeling module
- deepword.models.snn_modeling module
- deepword.models.transformer module
- deepword.models.utils module
- Module contents
- deepword.students package
- deepword.tests package
- deepword.tools package
- Submodules
- deepword.tools.build_test_set module
- deepword.tools.clean_hs2tj module
- deepword.tools.collect_game_elements module
- deepword.tools.compare_eval_results module
- deepword.tools.diff_train_w_test module
- deepword.tools.play_game module
- deepword.tools.read_eval_failed_reason module
- deepword.tools.read_eval_results module
- deepword.tools.replay_from_log module
- Module contents
Submodules¶
deepword.action module¶
-
class
deepword.action.ActionCollector(tokenizer: deepword.tokenizers.Tokenizer, n_tokens: int, unk_val_id: int, padding_val_id: int)¶ Bases:
deepword.log.Logging
Collect actions for different games.
-
__init__(tokenizer: deepword.tokenizers.Tokenizer, n_tokens: int, unk_val_id: int, padding_val_id: int) → None¶ - Parameters
tokenizer – see deepword.tokenizers
n_tokens – max allowed number of tokens for all actions
unk_val_id – ID of the unknown token
padding_val_id – ID of the padding token
-
property
action2idx¶ Current mapping from actions to token IDs
-
property
action_len¶ Current action lengths
-
property
action_matrix¶ Current action matrix
-
property
actions¶ Current actions as strings.
-
add_new_episode(gid: str) → None¶ Add a new episode with game ID.
- Parameters
gid – game ID, a string that separates different games.
-
extend(actions: List[str]) → numpy.ndarray¶ Extend actions into ActionCollector.
- Parameters
actions – a list of actions for current episode of game-playing.
-
get_action_len(gid: Optional[str] = None) → numpy.ndarray¶ Get action lengths for a game
- Parameters
gid – game ID; None falls back to the current episode of the game.
- Returns
an array of int, each element is a length for that action
-
get_action_matrix(gid: Optional[str] = None) → numpy.ndarray¶ Get action matrix for a game.
- Parameters
gid – the game ID; None falls back to the currently activated game.
- Returns
an array of actions; each action is a vector of token IDs, padded at the end so that all actions have the same token size.
-
get_actions(gid: Optional[str] = None) → List[str]¶ Get all actions for a game.
- Parameters
gid – game ID; None falls back to the current episode of the game.
- Returns
a list of actions in string
-
get_game_ids() → List[str]¶ Get all game IDs in this ActionCollector.
-
load_actions(path: str) → None¶ Load all actions into this ActionCollector.
- Parameters
path – a path to a npz file.
-
save_actions(path: str) → None¶ Save all actions to a path as a npz file.
- Parameters
path – a npz path to save
-
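For orientation, a hypothetical end-to-end usage sketch of ActionCollector; the tokenizer setup, the hparam names unk_val_id/padding_val_id, and the cmd_args values are assumptions, not taken from this reference:

```python
# Hypothetical usage of ActionCollector. The tokenizer setup, the hparam
# names unk_val_id/padding_val_id, and the cmd_args values are assumptions.
from deepword.action import ActionCollector
from deepword.hparams import load_hparams
from deepword.tokenizers import init_tokens

hp = load_hparams(cmd_args={"model_creator": "CnnDQN"})  # placeholder args
hp, tokenizer = init_tokens(hp)

ac = ActionCollector(
    tokenizer=tokenizer,
    n_tokens=10,                       # max tokens per action
    unk_val_id=hp.unk_val_id,          # assumed hparam name
    padding_val_id=hp.padding_val_id)  # assumed hparam name

ac.add_new_episode(gid="game-1")
ac.extend(["go east", "open door", "take apple"])

print(ac.actions)        # current actions as strings
print(ac.action_matrix)  # token-ID matrix, padded to n_tokens
ac.save_actions("actions.npz")
```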
deepword.dependency_parser module¶
-
class
deepword.dependency_parser.DependencyParserReorder(padding_val: str, stride_len: int)¶ Bases: deepword.log.Logging
Use a dependency parser to reorder master sentences. Make sure to start the Stanford CoreNLP server first. Refer to https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
The DP reorder class is used with CNN layers for trajectory encoding. Refer to https://arxiv.org/abs/1905.02265 for details.
-
__init__(padding_val: str, stride_len: int) → None¶ - Parameters
padding_val – padding token, e.g. ‘[PAD]’ or ‘O’
stride_len – CNN stride len
-
reorder(master: str) → str¶ Use dependency parser to reorder a paragraph.
-
deepword.eval_games module¶
-
class
deepword.eval_games.EvalResult(score, positive_score, negative_score, max_score, steps, won, action_list)¶
-
class
deepword.eval_games.FullDirEvalPlayer¶ Bases:
deepword.log.Logging
-
classmethod
start(hp, model_dir, game_files, n_gpus, range_min=None, range_max=None)¶
-
class
deepword.eval_games.LoopDogEvalPlayer¶ Bases:
deepword.log.Logging
-
start(hp, model_dir, game_files, n_gpus)¶
-
-
class
deepword.eval_games.MultiGPUsEvalPlayer(hp, model_dir, game_files, n_gpus, load_best=True)¶ Bases:
deepword.log.Logging
Eval player that runs on multiple GPUs.
-
evaluate(restore_from: str, debug: bool = False) → None¶ Evaluate an agent
- Parameters
restore_from – path to restore weights
debug – if True, multi-threading will be disabled
-
has_better_model(total_scores: float, total_steps: float) → bool¶ Whether the current model is better than the previous best.
- Parameters
total_scores – total scores earned
total_steps – total steps used
-
save_best_model(loaded_ckpt_step: int) → None¶ Copy current model to the best model dir
- Parameters
loaded_ckpt_step – which model to copy
-
classmethod
split_game_files(game_files: List[str], k: int, rnd_seed: int = 42) → List[List[str]]¶ Split game files into k portions for multi-GPU playing
- Parameters
game_files – a list of games for playing
k – number of splits
rnd_seed – random seed
- Returns
a list of list of game files
-
-
class
deepword.eval_games.NewModelHandler(hp, model_dir, game_files, n_gpus)¶ Bases:
watchdog.events.FileSystemEventHandler
-
is_ckpt_file(src_path)¶
-
on_created(event)¶ Called when a file or directory is created.
- Parameters
event (
DirCreatedEvent or FileCreatedEvent) – Event representing file/directory creation.
-
on_modified(event)¶ Called when a file or directory is modified.
- Parameters
event (
DirModifiedEvent or FileModifiedEvent) – Event representing file/directory modification.
-
run_eval_player(restore_from=None, load_best=False)¶
-
-
class
deepword.eval_games.WatchDogEvalPlayer¶ Bases:
deepword.log.Logging
-
start(hp, model_dir, game_files, n_gpus)¶
-
-
deepword.eval_games.agent_collect_data(agent, game_files, max_episode_steps, epoch_size, epoch_limit)¶
-
deepword.eval_games.agg_eval_results(eval_results: Dict[str, List[deepword.eval_games.EvalResult]], max_steps_per_episode: int = 100) → Tuple[Dict[str, deepword.eval_games.EvalResult], float, float, float, float, float, float]¶ Aggregate evaluation results. We run N test games, each with M episodes, each episode has a maximum of K steps.
- Parameters
eval_results – evaluation results of text-based games, in the following format: dict(game_name, [eval_result1, …, eval_resultM]), where the number of eval_results is the same for all games. Each eval_result contains: score, positive_score, negative_score, max_score, steps, won (bool), action_list.
max_steps_per_episode – i.e. K, default = 100
- Returns
agg_per_game: dict(game_name, (sum scores, sum max scores, sum steps, # won))
sample_mean: total earned scores / total maximum scores
confidence_interval: confidence interval of sample_mean over M episodes
steps: total used steps / total maximum steps
- Return type
agg_per_game
-
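To make the input and output shapes concrete, a hedged sketch of calling agg_eval_results; the scores are invented and the keyword construction assumes EvalResult is the namedtuple-style class documented above:

```python
# Invented numbers illustrating agg_eval_results' expected input.
from deepword.eval_games import EvalResult, agg_eval_results

eval_results = {
    "tw-cooking-recipe1": [
        EvalResult(score=3, positive_score=3, negative_score=0,
                   max_score=5, steps=40, won=False, action_list=[]),
        EvalResult(score=5, positive_score=5, negative_score=0,
                   max_score=5, steps=25, won=True, action_list=[]),
    ],
}

# First element is agg_per_game; the remaining six floats include
# sample_mean, confidence_interval, and the step ratio described above.
agg_per_game, sample_mean, ci, *rest = agg_eval_results(
    eval_results, max_steps_per_episode=100)
print(agg_per_game, sample_mean, ci)
```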
deepword.eval_games.eval_agent(hp: tensorflow.contrib.training.python.training.hparam.HParams, model_dir: str, load_best: bool, restore_from: str, game_files: List[str], gpu_device: Optional[str] = None) → Tuple[Dict[str, List[deepword.eval_games.EvalResult]], int]¶ Evaluate an agent with given games. For each game, we run nb_episodes episodes, with at most max_episode_steps steps per episode.
Notice that running games for evaluation differs from training. In training, we register all given games to the TextWorld structure and play them in random order. For evaluation, we register one game at a time and play it for nb_episodes.
- Parameters
hp – hyperparameter to create the agent
model_dir – model dir of the agent
load_best – bool, load from best_weights if True, otherwise from last_weights
restore_from – string, load from a specific model, e.g. {model_dir}/last_weights/after_epoch-0
game_files – game files for evaluation
gpu_device – which GPU device to load, in a format of “/device:GPU:i”
- Returns
eval_results, loaded_ckpt_step
-
deepword.eval_games.scores_of_tiers(agg_per_game: Dict[str, deepword.eval_games.EvalResult]) → Dict[str, float]¶ Compute scores per tier given aggregated scores per game
- Parameters
agg_per_game – Aggregated results per game
- Returns
a map of tier-name -> score, from tier1 to tier6
deepword.floor_plan module¶
-
class
deepword.floor_plan.FloorPlanCollector¶ Bases:
deepword.log.Logging
Collect floor plans from games.
e.g. if going east from the kitchen leads to the bedroom, then we know kitchen – east –> bedroom and bedroom – west –> kitchen.
-
add_new_episode(eid)¶
-
extend(fps)¶
-
get_map(room)¶
-
init()¶
-
load_fps(path)¶
-
route_to_kitchen(room)¶
-
classmethod
route_to_room(ss, tt, fp, visited)¶ Find the fastest route to a target room from a given room using DFS (see the sketch after this class).
- Parameters
ss – start room
tt – target room
fp – floor plan
visited – initialized by []
- Returns
directions, rooms
-
save_fps(path)¶
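Since route_to_room is described only by its signature, here is a standalone DFS sketch over an assumed floor-plan shape of {room: {direction: neighbor}}; it is illustrative, not the package's implementation:

```python
# Illustrative DFS route finder over an assumed floor-plan shape
# {room: {direction: neighbor}}; not the package's actual implementation.
from typing import Dict, List, Optional, Tuple

def route_to_room(ss: str, tt: str, fp: Dict[str, Dict[str, str]],
                  visited: List[str]) -> Optional[Tuple[List[str], List[str]]]:
    if ss == tt:
        return [], []
    visited.append(ss)
    for direction, room in fp.get(ss, {}).items():
        if room in visited:
            continue
        res = route_to_room(room, tt, fp, visited)
        if res is not None:
            directions, rooms = res
            return [direction] + directions, [room] + rooms
    return None

fp = {"kitchen": {"east": "bedroom"}, "bedroom": {"west": "kitchen"}}
print(route_to_room("kitchen", "bedroom", fp, []))  # (['east'], ['bedroom'])
```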
deepword.hparams module¶
-
class
deepword.hparams.Conventions(logo_file, bert_ckpt_dir, bert_vocab_file, nltk_vocab_file, glove_vocab_file, glove_emb_file, legacy_zork_vocab_file, albert_ckpt_dir, albert_vocab_file, albert_spm_path, bert_cls_token, bert_unk_token, bert_padding_token, bert_sep_token, bert_mask_token, bert_sos_token, bert_eos_token, albert_cls_token, albert_unk_token, albert_padding_token, albert_sep_token, albert_mask_token, nltk_unk_token, nltk_padding_token, nltk_sos_token, nltk_eos_token)¶ Bases:
deepword.hparams.Conventions
-
deepword.hparams.copy_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams) → tensorflow.contrib.training.python.training.hparam.HParams¶ Deepcopy for hp
-
deepword.hparams.get_model_hparams(model_creator: str) → tensorflow.contrib.training.python.training.hparam.HParams¶
-
deepword.hparams.has_valid_val(dict_args: Optional[Dict[str, Any]], key: str) → bool¶ Check that dict_args exists, key is in dict_args, and dict_args[key] is not None.
-
deepword.hparams.load_hparams(fn_model_config: Optional[str] = None, cmd_args: Optional[Dict[str, Any]] = None, fn_pre_config: Optional[str] = None) → tensorflow.contrib.training.python.training.hparam.HParams¶ Load hyper-parameters.
Priority: file_args > cmd_args, except for args in allowed_to_change; cmd_args > pre_config; pre_config > default.
- Parameters
fn_model_config – hyperparameter config file in model_dir
cmd_args – command line arguments
fn_pre_config – pre config file for model
- Returns
hp
-
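A hedged usage sketch of the priority order described above; the file names and cmd_args keys are placeholders:

```python
# Hypothetical call; file names and cmd_args keys are placeholders.
from deepword.hparams import load_hparams, output_hparams

hp = load_hparams(
    fn_model_config="model_dir/hparams.json",  # highest priority
    cmd_args={"batch_size": 32},               # wins over pre-config
    fn_pre_config="pre_config.json")           # lowest priority above defaults
print(output_hparams(hp))                      # pretty table of the result
```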
deepword.hparams.output_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams) → str¶ Pretty-print hp as a table-style string
-
deepword.hparams.save_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams, file_path: str) → None¶ Save hyperparameters to a json file
-
deepword.hparams.update_hparams_from_dict(hp: tensorflow.contrib.training.python.training.hparam.HParams, cmd_args: Dict[str, Any], allowed_to_change: Optional[Iterable[str]] = None) → tensorflow.contrib.training.python.training.hparam.HParams¶ Update hp from a dict.
- Parameters
hp – hyperparameters
cmd_args – command line arguments
allowed_to_change – keys that are allowed to update
- Returns
a hp
-
deepword.hparams.update_hparams_from_file(hp: tensorflow.contrib.training.python.training.hparam.HParams, file_args: str) → tensorflow.contrib.training.python.training.hparam.HParams¶ update hp from a json file
-
deepword.hparams.update_hparams_from_hparams(hp: tensorflow.contrib.training.python.training.hparam.HParams, hp2: tensorflow.contrib.training.python.training.hparam.HParams) → tensorflow.contrib.training.python.training.hparam.HParams¶ Update hp from hp2. hp must not share keys with hp2.
deepword.log module¶
-
class
deepword.log.Logging(name: Optional[str] = None)¶ Bases:
object
Logging utils for classes.
-
__init__(name: Optional[str] = None)¶ - Parameters
name – name for logging, default module_name.class_name
-
debug(msg, *args, **kwargs)¶
-
error(msg, *args, **kwargs)¶
-
info(msg, *args, **kwargs)¶
-
warning(msg, *args, **kwargs)¶
-
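A minimal sketch of subclassing Logging, based only on the methods documented above:

```python
# Minimal sketch of the documented Logging interface.
from deepword.log import Logging

class MyCollector(Logging):
    def __init__(self):
        # default name would be module_name.class_name
        super().__init__(name="deepword.my_collector")

    def run(self):
        self.info("starting")                # mirrors stdlib logging
        self.debug("details: %s", {"k": 1})
```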
deepword.main module¶
-
deepword.main.eval_one_ckpt(hp, model_dir, data_path, learner_clazz, device, ckpt_path)¶
-
deepword.main.get_parser() → argparse.ArgumentParser¶ Get arg parser for different modules
-
deepword.main.hp_parser() → argparse.ArgumentParser¶ Arg parser for hyper-parameters
-
deepword.main.main(args)¶
-
deepword.main.process_eval_dqn(args)¶ Evaluate dqn models
-
deepword.main.process_eval_student(args)¶ Evaluate student models
-
deepword.main.process_gen_data(args)¶ Generate training data from a teacher model
-
deepword.main.process_hp(args) → tensorflow.contrib.training.python.training.hparam.HParams¶ Load hyperparameters from three locations: 1. the config file in model_dir; 2. pre-config files; 3. cmd line args.
- Parameters
args – cmd line args
- Returns
hyperparameters
-
deepword.main.process_snn_input(args)¶ generate snn input
-
deepword.main.process_train_dqn(args)¶ Train DQN models
-
deepword.main.process_train_student(args)¶ Train student models
-
deepword.main.run_agent(agent: deepword.agents.base_agent.BaseAgent, game_env: gym.core.Env, nb_games: int, nb_epochs: int) → None¶ Run a training agent on the given games.
- Parameters
agent – an agent extending the base agent, see deepword.agents.base_agent.BaseAgent
game_env – game Env, from gym
nb_games – number of games
nb_epochs – number of epochs for training
-
deepword.main.run_agent_v2(agent: deepword.agents.base_agent.BaseAgent, game_env: gym.core.Env, nb_games: int, nb_epochs: int) → None¶ Run a training agent on the given games. Proactively request look and inventory results from games to substitute the description and inventory parts of infos. This is useful when games don’t provide description and inventory, e.g. for Z-machine games.
NB: this will incur extra steps for game playing; remember to use three times the previous step quota. E.g. if you previously used 100 max steps, you now need 100 * 3 max steps.
-
deepword.main.train(hp: tensorflow.contrib.training.python.training.hparam.HParams, model_dir: str, game_dir: str, f_games: Optional[str] = None, func_run_agent: Callable[[deepword.agents.base_agent.BaseAgent, gym.core.Env, int, int], None] = <function run_agent>) → None¶ train an agent
- Parameters
hp – hyper-parameters, see deepword.hparams
model_dir – model dir
game_dir – game dir with ulx games
f_games – a file of game names to select from game_dir
func_run_agent – how to run the agent and games, see
deepword.main.run_agent()
-
deepword.main.train_v2(hp, model_dir, game_dir, f_games=None)¶ Train DQN agents by proactively requesting description and inventory
The max steps per episode will be enlarged 3-fold.
deepword.stats module¶
-
class
deepword.stats.UCBComputer(d_states: int, d_actions: int)¶ Bases:
object
Compute the Upper Confidence Bound of actions during game playing, at inference time only, when hidden states are fixed.
We use the Abbasi-Yadkori, Pal, and Szepesvari bound for LinUCB (the APS bound).
Cite: Improved Algorithms for Linear Stochastic Bandits (Abbasi-Yadkori, Pal, and Szepesvari, 2011)
See also: Learn What Not to Learn (Tom Zahavy et al., 2019)
-
__init__(d_states: int, d_actions: int)¶
V: covariance matrix for each action a
lam: lambda, to control parameter size in ridge regression
r: R-sub-Gaussian
s: bound for |theta_a|_2
delta: the bound holds with probability 1 - delta
- Parameters
d_states – dimension of hidden states
d_actions – number of actions
-
aps_bound(q_actions: numpy.ndarray, h_state: numpy.ndarray) → float¶ Compute APS bound
- Parameters
q_actions – Q-vector of actions
h_state – hidden state of a game state
- Returns
upper confidence bound of q_actions.
-
collect_sample(action_idx: int, h_state: numpy.ndarray) → None¶ Collect state-action pairs
- Parameters
action_idx – action index
h_state – hidden state vector
-
reset() → None¶ Reset to accept new episodes
-
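For intuition, an illustrative standalone sketch of the LinUCB-style confidence width built from the documented quantities (V, lam, r, s, delta); the exact constants follow the Abbasi-Yadkori et al. (2011) self-normalized bound, and the bookkeeping here is an assumption, not deepword's code:

```python
# Illustrative LinUCB-style confidence width per Abbasi-Yadkori et al.
# (2011); constants and bookkeeping are assumptions, not deepword's code.
import numpy as np

d_states, lam, r, s, delta = 64, 1.0, 0.5, 1.0, 0.05
V = lam * np.eye(d_states)  # per-action covariance, ridge-initialized
t = 0                       # samples collected for this action

def collect_sample(h_state: np.ndarray) -> None:
    global t
    V[:] += np.outer(h_state, h_state)
    t += 1

def confidence_width(h_state: np.ndarray) -> float:
    # beta_t from the self-normalized bound, holding w.p. 1 - delta
    beta = r * np.sqrt(d_states * np.log((1 + t / lam) / delta)) \
        + np.sqrt(lam) * s
    return float(beta * np.sqrt(h_state @ np.linalg.solve(V, h_state)))

h = np.random.randn(d_states)
collect_sample(h)
print(confidence_width(h))  # add this to Q(s, a) for an optimistic value
```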
deepword.stats.mean_confidence_interval(data: numpy.ndarray, confidence: float = 0.95) → Tuple[float, float]¶ Given data (a 1D np array), compute the mean and confidence interval at the given confidence level.
- Parameters
data – 1D np array
confidence – confidence level
- Returns
mean and confidence interval
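A sketch of the standard way such an interval is computed, using the Student-t distribution from scipy; the package's exact formula may differ:

```python
# Sketch using scipy's Student-t quantile; the package's exact
# formula may differ.
import numpy as np
from scipy import stats

def mean_confidence_interval(data: np.ndarray, confidence: float = 0.95):
    n = len(data)
    mean = float(np.mean(data))
    half_width = stats.sem(data) * stats.t.ppf((1 + confidence) / 2.0, n - 1)
    return mean, float(half_width)

print(mean_confidence_interval(np.array([1.0, 2.0, 3.0, 4.0])))
```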
deepword.sum_tree module¶
This SumTree code is a modified version of Morvan Zhou’s: https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/master/contents/5.2_Prioritized_Replay_DQN/RL_brain.py
-
class
deepword.sum_tree.SumTree(*args, **kwds)¶ Bases:
typing.Generic
The SumTree is a binary tree, with leaf nodes containing the real data.
-
add(priority: float, data: E)¶ Add an experience into the tree with a priority
- Parameters
priority – priority of sampling
data – experience of the replay
- Returns
old data at the same position, 0 if unset
-
get_leaf(v: float) → Tuple[int, float, E]¶ Get a leaf index w.r.t. a priority value. The selected leaf index must have the smallest priority among all leaves whose priority values are larger than v.
- Parameters
v – a priority value
- Returns
leaf index, priority, and experience associated with the leaf index
-
property
total_priority¶ The total priority is the value on the root node.
-
update(tree_index: int, priority: float) → None¶ Update the leaf priority score and propagate the change through tree
- Parameters
tree_index – tree index of the current data_pointer
priority – priority sampling value
-
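A compact standalone sketch of the sum-tree idea behind this class (internal nodes cache the sum of their children's priorities); the capacity handling and return conventions are simplified assumptions, not the package's code:

```python
# Simplified sum-tree sketch: internal nodes store the sum of their
# children's priorities; leaves store data priorities. Not deepword's code.
import numpy as np

class MiniSumTree:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.tree = np.zeros(2 * capacity - 1)  # priorities
        self.data = [None] * capacity
        self.ptr = 0

    def add(self, priority: float, data) -> None:
        leaf = self.ptr + self.capacity - 1
        self.data[self.ptr] = data
        self.update(leaf, priority)
        self.ptr = (self.ptr + 1) % self.capacity  # overwrite oldest

    def update(self, tree_index: int, priority: float) -> None:
        change = priority - self.tree[tree_index]
        self.tree[tree_index] = priority
        while tree_index != 0:                     # propagate to the root
            tree_index = (tree_index - 1) // 2
            self.tree[tree_index] += change

    def get_leaf(self, v: float):
        i = 0
        while 2 * i + 1 < len(self.tree):          # descend to a leaf
            left = 2 * i + 1
            if v <= self.tree[left]:
                i = left
            else:
                v -= self.tree[left]
                i = left + 1
        return i, self.tree[i], self.data[i - self.capacity + 1]

    @property
    def total_priority(self) -> float:
        return float(self.tree[0])

tree = MiniSumTree(4)
for p, d in [(1.0, "a"), (2.0, "b"), (3.0, "c")]:
    tree.add(p, d)
print(tree.total_priority, tree.get_leaf(2.5))  # 6.0 (4, 2.0, 'b')
```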
deepword.tokenizers module¶
-
class
deepword.tokenizers.AlbertTokenizer(vocab_file, do_lower_case, spm_model_file)¶ Bases:
deepword.tokenizers.BertTokenizer
The tokenizer from Albert.
-
de_tokenize(ids)¶ Turn a list of ids into a string.
- Parameters
ids – ids of tokens
- Returns
a string
-
-
class
deepword.tokenizers.BertTokenizer(vocab_file, do_lower_case)¶ Bases:
deepword.tokenizers.Tokenizer
The tokenizer from BERT.
-
convert_ids_to_tokens(ids)¶ convert ids to tokens
- Parameters
ids – a list of ids
- Returns
a list of tokens
-
convert_tokens_to_ids(tokens)¶ convert tokens into ids
- Parameters
tokens – a list of tokens
- Returns
a list of ids
-
de_tokenize(ids)¶ Turn a list of ids into a string.
- Parameters
ids – ids of tokens
- Returns
a string
-
property
inv_vocab¶ inverse of vocabulary
- Returns
map from positions (ids) to tokens
-
tokenize(text)¶ tokenize a text into a list of tokens
- Parameters
text – a string to tokenize
- Returns
a list of tokens
-
property
vocab¶ get the vocabulary
- Returns
map from tokens to positions (ids)
-
-
class
deepword.tokenizers.LegacyZorkTokenizer(vocab_file)¶ Bases:
deepword.tokenizers.NLTKTokenizer
The NLTK tokenizer, but keeps only alphabetic strings by removing all tokens with non-alphabetic characters.
-
tokenize(text)¶ tokenize a text into a list of tokens
- Parameters
text – a string to tokenize
- Returns
a list of tokens
-
-
class
deepword.tokenizers.NLTKTokenizer(vocab_file, do_lower_case)¶ Bases:
deepword.tokenizers.Tokenizer
A wrapper of the tokenizer from the NLTK package.
-
convert_ids_to_tokens(ids)¶ convert ids to tokens
- Parameters
ids – a list of ids
- Returns
a list of tokens
-
convert_tokens_to_ids(tokens)¶ convert tokens into ids
- Parameters
tokens – a list of tokens
- Returns
a list of ids
-
de_tokenize(ids: List[int]) → str¶ Turn a list of ids into a string.
- Parameters
ids – ids of tokens
- Returns
a string
-
property
inv_vocab¶ inverse of vocabulary
- Returns
map from positions (ids) to tokens
-
tokenize(text)¶ tokenize a text into a list of tokens
- Parameters
text – a string to tokenize
- Returns
a list of tokens
-
property
vocab¶ get the vocabulary
- Returns
map from tokens to positions (ids)
-
-
class
deepword.tokenizers.Tokenizer¶ Bases:
object
A wrapper of a tokenizer.
-
convert_ids_to_tokens(ids: List[int]) → List[str]¶ convert ids to tokens
- Parameters
ids – a list of ids
- Returns
a list of tokens
-
convert_tokens_to_ids(tokens: List[str]) → List[int]¶ convert tokens into ids
- Parameters
tokens – a list of tokens
- Returns
a list of ids
-
de_tokenize(ids: List[int]) → str¶ Turn a list of ids into a string.
- Parameters
ids – ids of tokens
- Returns
a string
-
property
inv_vocab¶ inverse of vocabulary
- Returns
map from positions (ids) to tokens
-
tokenize(text: str) → List[str]¶ tokenize a text into a list of tokens
- Parameters
text – a string to tokenize
- Returns
a list of tokens
-
property
vocab¶ get the vocabulary
- Returns
map from tokens to positions (ids)
-
-
deepword.tokenizers.get_albert_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams) → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
-
deepword.tokenizers.get_bert_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams) → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
-
deepword.tokenizers.get_nltk_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams, vocab_file: str = '/Users/xusenyin/git-store/deepword/python/deepword/../../resources/vocab.txt') → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
-
deepword.tokenizers.get_zork_tokenizer(hp: tensorflow.contrib.training.python.training.hparam.HParams, vocab_file: str = '/Users/xusenyin/local/opt/legacy-zork-vocab.txt') → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶
-
deepword.tokenizers.init_tokens(hp: tensorflow.contrib.training.python.training.hparam.HParams) → Tuple[tensorflow.contrib.training.python.training.hparam.HParams, deepword.tokenizers.Tokenizer]¶ Initialize a tokenizer given hyperparameters
- Parameters
hp – hyperparameters, see
deepword.hparams
- Returns
updated hp, tokenizer
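A hedged usage sketch of init_tokens together with the Tokenizer interface documented above; the cmd_args key tokenizer_type is an assumed placeholder:

```python
# Hypothetical usage; the tokenizer_type key/value is an assumption.
from deepword.hparams import load_hparams
from deepword.tokenizers import init_tokens

hp = load_hparams(cmd_args={"tokenizer_type": "nltk"})
hp, tokenizer = init_tokens(hp)

tokens = tokenizer.tokenize("open the door and go east")
ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens, ids)
print(tokenizer.de_tokenize(ids))  # back to a string
```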
deepword.trajectory module¶
-
class
deepword.trajectory.Trajectory(*args, **kwds)¶ Bases:
typing.Generic, deepword.log.Logging
BaseTrajectory only takes care of interacting with the Agent on collecting game scripts. Fetching data from the Trajectory and feeding it into the Encoder should be implemented in extended classes.
-
__init__(num_turns: int, size_per_turn: int = 1)¶ Take the ActionMaster (AM) as an example: given Trajectory(AM1, AM2, AM3, AM4, AM5) with last_sid pointing to AM5, num_turns = 1 means we choose [AM5].
size_per_turn only controls the way we separate pre- and post-trajectory. By default, with size_per_turn = 1, AM4 is the pre-trajectory of AM5.
Sometimes we need to change it, e.g. with legacy data where we store the trajectory as Trajectory(M1, A1, M2, A2, M3, A3, M4) and the last sid points to M4; then the pre-trajectory of [A3, M4] is [A2, M3], which is why size_per_turn should be set to 2.
- Parameters
num_turns – how many turns to choose other than current turn
size_per_turn – how many cells count as one turn
-
add_new_tj(tid: Optional[int] = None) → int¶ Add a new trajectory
- Parameters
tid – trajectory id, None falls back to auto-generated id.
- Returns
a tid
-
append(content: T) → None¶ Use a generic type for content; the trajectory class doesn’t care what is stored and doesn’t process it.
- Parameters
content – something to add in the current trajectory
-
fetch_batch_pre_states(b_tid: List[int], b_sid: List[int]) → List[List[T]]¶ Fetch a batch of pre-states given trajectory ids and state ids
The position of pre-states depends on size_per_turn.
- Parameters
b_tid – a batch of trajectory ids
b_sid – a batch of state ids
- Returns
a list of lists of contents
-
fetch_batch_states(b_tid: List[int], b_sid: List[int]) → List[List[T]]¶ Fetch a batch of states given trajectory ids and state ids.
- Parameters
b_tid – a batch of trajectory ids
b_sid – a batch of state ids
- Returns
a list of lists of contents
-
fetch_last_state() → List[T]¶ Fetch the last state from the current trajectory
- Returns
a list of contents
-
fetch_state_by_idx(tid: int, sid: int) → List[T]¶ fetch a state given trajectory id and state id
- Returns
a list of contents
-
get_current_tid() → int¶ Get current trajectory id
-
get_last_sid() → int¶ A state is defined as a series of interactions between a game and an agent ended with the game’s last response. e.g. “G0, A1, G2, A3, G4” is a state ended with the game’s last response named “G4”.
-
load_tjs(path: str) → None¶ Load trajectories from a npz file
-
request_delete_keys(ks: List[int]) → Dict[int, List[T]]¶ Request to delete all trajectories with keys in ks.
- Parameters
ks – a list of keys of trajectories to be deleted
-
save_tjs(path: str) → None¶ Save all trajectories in a npz file
All trajectory ids, all trajectories, current trajectory id, and current trajectory will be saved.
-
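A hypothetical usage sketch of the documented Trajectory API (add_new_tj, append, fetch_last_state, save_tjs); the contents and file name are placeholders:

```python
# Hypothetical usage of the documented Trajectory API.
from deepword.trajectory import Trajectory

tjs = Trajectory(num_turns=1)          # keep one extra turn besides current
tid = tjs.add_new_tj()                 # auto-generated trajectory id
for content in ["AM1", "AM2", "AM3"]:  # placeholder contents
    tjs.append(content)
print(tjs.fetch_last_state())
tjs.save_tjs("trajectories.npz")
```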
deepword.tree_memory module¶
This SumTree code is a modified version; the original code is from: https://github.com/jaara/AI-blog/blob/master/Seaquest-DDQN-PER.py
-
class
deepword.tree_memory.TreeMemory(*args, **kwds)¶ Bases:
deepword.log.Logging, typing.Generic
TreeMemory to store and sample experiences for replay.
-
append(experience: E) → E¶ New experiences receive the max priority over all leaves to make sure they get sampled.
- Parameters
experience – a new experience
- Returns
previous experience in the same position
-
batch_update(tree_idx: numpy.ndarray, abs_errors: numpy.ndarray) → None¶ Update the priorities on the tree
- Parameters
tree_idx – an array of index (int)
abs_errors – an array of abs errors (float)
-
load_memo(path: str) → None¶ Loading a memo only applies to a fresh memo without any appended experiences; loading a memo will replace the previous tree structure.
- Parameters
path – a npz file to load
-
sample_batch(n: int) → Tuple[numpy.ndarray, List[E], numpy.ndarray]¶ Sample a batch of experiences according to priority values (see the sketch after this class).
First, to sample a batch of size n, the range [0, priority_total] is divided into n equal ranges.
Then a value is uniformly sampled from each range.
We search the SumTree for the experiences whose priority scores correspond to the sampled values and retrieve them.
Finally, we calculate importance sampling (IS) weights for each element in the batch.
- Parameters
n – batch size
- Returns
tree index, experiences, IS weights
-
save_memo(path: str) → None¶ Save the memory to a npz file
- Parameters
path – path to a npz file
-
uniform_sample_batch(n: int) → numpy.ndarray¶ Randomly sample a batch of experiences
- Parameters
n – batch size
- Returns
a batch of experiences
-
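A standalone sketch of the stratified sampling and importance-sampling weights that sample_batch describes, run over a flat priority array instead of the tree; the IS exponent beta is an assumed hyperparameter, and this is not the TreeMemory implementation:

```python
# Stratified priority sampling + IS weights over a flat priority array;
# illustrates sample_batch's steps, not the TreeMemory implementation.
# The IS exponent beta is an assumed hyperparameter.
import numpy as np

def sample_batch(priorities: np.ndarray, n: int, beta: float = 0.4):
    total = priorities.sum()
    edges = np.linspace(0.0, total, n + 1)
    vs = np.random.uniform(edges[:-1], edges[1:])  # one draw per range
    cum = np.cumsum(priorities)
    idx = np.searchsorted(cum, vs)                 # stands in for the tree walk
    probs = priorities[idx] / total
    weights = (len(priorities) * probs) ** (-beta)
    return idx, weights / weights.max()            # normalized IS weights

p = np.array([1.0, 2.0, 3.0, 4.0])
print(sample_batch(p, n=2))
```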
deepword.utils module¶
-
deepword.utils.agent_name2clazz(name: str)¶ Find the class given the agent name in this package.
- Parameters
name – Agent name from
deepword.agents
- Returns
the class w.r.t. the agent name
-
deepword.utils.bytes2idx(byte_mask: List[bytes], size: int) → numpy.ndarray¶ Load a list of bytes into a mask that chooses 1 for selected actions.
- Parameters
byte_mask – a list of bytes
size – the size of total actions
- Returns
an np array of indices
-
deepword.utils.core_name2clazz(name: str)¶ Find the class given the core name in this package.
- Parameters
name – Core name from
deepword.agents.cores
- Returns
the class w.r.t. the core name
-
deepword.utils.ctime() → int¶ current time in milliseconds
-
deepword.utils.eprint(*args, **kwargs)¶ print to stderr
-
deepword.utils.flatmap(f, items)¶ flatmap for python
-
deepword.utils.flatten(items)¶ flatten a list of lists to a list
-
deepword.utils.get_hash(txt: str) → str¶ get hex hash value for a string
-
deepword.utils.get_token2idx(tokens: List[str]) → Dict[str, int]¶ From a list of tokens to a dict of token to position
-
deepword.utils.learner_name2clazz(name: str)¶ Find the class given the learner name in this package.
- Parameters
name – Learner name from
deepword.students
- Returns
the class w.r.t. the learner name
-
deepword.utils.load_actions(action_file: str) → List[str]¶ Load unique actions from an action file
-
deepword.utils.load_and_split(game_path: str, f_games: str) → Tuple[List[str], List[str]]¶ Load games and split train dev set
- Parameters
game_path – game dir
f_games – a file with a list of games, one game name per line, without the .ulx suffix
- Returns
train_games, dev_games
-
deepword.utils.load_game_files(game_path: str, f_games: Optional[str] = None) → List[str]¶ Load a dir of games, or a single game.
If game_path is a file, return a list containing that file; if game_path is a dir, return a list of files in the dir suffixed with .ulx; if f_games is set, load the files in game_path whose names are listed in f_games.
- Parameters
game_path – a dir, or a single file
f_games – a file of game names, without suffix, default suffix .ulx
- Returns
a list of game files
-
deepword.utils.load_uniq_lines(fname: str) → List[str]¶ Load unique lines from a file, line order preserved
-
deepword.utils.load_vocab(vocab_file: str) → List[str]¶ Load unique words from a vocabulary
-
deepword.utils.model_name2clazz(name: str)¶ Find the class given the model name in this package.
- Parameters
name – Model name from
deepword.models
- Returns
the class w.r.t. the model name
-
deepword.utils.report_status(lst_of_status: List[Tuple[str, Any]]) → str¶ Pretty print a series of k-v pairs
- Parameters
lst_of_status – A list of k-v pairs
- Returns
a string to print
-
deepword.utils.setup_eval_log(log_filename: str)¶ Setup log for evaluation
- Parameters
log_filename – the path to log file
-
deepword.utils.setup_logging(default_path: str = 'logging.yaml', default_level: int = 20, env_key: str = 'LOG_CFG', local_log_filename: Optional[str] = None) → None¶ Setup logging for python project
Load the YAML config file from default_path, or from the path in the environment variable set by env_key. Falls back to the default config if the file does not exist.
If local_log_filename is set, add a local rotating log file.
-
deepword.utils.setup_train_log(model_dir: str)¶ Setup log for training by putting a game_script.log in model_dir.
-
deepword.utils.softmax(x: numpy.ndarray) → numpy.ndarray¶ Numerically stable softmax
-
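The usual max-subtraction trick behind a numerically stable softmax, as a sketch consistent with the signature above:

```python
# Max subtraction keeps every exponent <= 0, so exp never overflows.
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(softmax(np.array([1000.0, 1001.0, 1002.0])))  # no overflow warnings
```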
deepword.utils.split_train_dev(game_files: List[str], train_ratio: float = 0.9, rnd_seed: int = 42) → Tuple[List[str], List[str]]¶ Split train/dev sets from given game files: sort, shuffle with Random(rnd_seed), then split.
- Parameters
game_files – game files
train_ratio – the percentage of training files
rnd_seed – for randomly shuffle files, default = 42
- Returns
train_games, dev_games
- Exception:
empty game_files
-
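A sketch of the sort / shuffle / split recipe named above, including the documented exception on empty game_files; this mirrors the description, not necessarily the exact implementation:

```python
# Sketch of sort -> shuffle(Random(rnd_seed)) -> split; mirrors the
# documented recipe, not necessarily the exact implementation.
from random import Random
from typing import List, Tuple

def split_train_dev(game_files: List[str], train_ratio: float = 0.9,
                    rnd_seed: int = 42) -> Tuple[List[str], List[str]]:
    if not game_files:
        raise ValueError("empty game_files")
    files = sorted(game_files)
    Random(rnd_seed).shuffle(files)
    n_train = int(len(files) * train_ratio)
    return files[:n_train], files[n_train:]

train, dev = split_train_dev([f"game-{i}.ulx" for i in range(10)])
print(len(train), len(dev))  # 9 1
```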
deepword.utils.uniq(lst)¶ order-preserving unique
