These are a few practices that matter in ML projects. I sometimes forget them too. :P
- Use version control.
- Separate code from data.
- Separate input data, working data and output data.
- Modify input data with care.
- Save everything to disk frequently.
- Separate options from parameters.
- Do not use global variables.
- Record the options used to generate each run of the algorithm.
% store the results
serialise(options, 'options.dat', options.working_path);
serialise(parameters, 'parameters.dat', options.working_path);
% OR, with the configuration name appended to the working path:
serialise(parameters, 'parameters.dat', ...
    [options.working_path '_' options_configuration.name]);
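The `serialise` helper above is not defined in these notes. As a rough sketch of the same idea using only MATLAB's built-in `save` and `load`, the snippet below stores a run's options next to its results and reads them back. The option fields, the `experiment_demo` directory, and the `options.mat` file name are assumptions made for illustration.

```matlab
% Hypothetical options for one run; field names and values are illustrative only.
options = struct('learning_rate', 0.01, 'num_iterations', 100);
options.working_path = fullfile(tempdir, 'experiment_demo');

% Create the run directory if it does not exist yet.
if ~exist(options.working_path, 'dir')
    mkdir(options.working_path);
end

% Store the options next to the results they will produce.
options_file = fullfile(options.working_path, 'options.mat');
save(options_file, 'options');

% Later: recover exactly which options generated this run.
loaded = load(options_file);
disp(loaded.options);
```

Keeping the options file inside the run's own directory means the results can never be separated from the settings that produced them.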
- Make it easy to sweep options.
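One possible way to make sweeps easy is to start from a base options struct and generate one named copy per combination of values. This is only a sketch: the field names (`learning_rate`, `num_hidden`), the values, and the `sweep_demo` directory are hypothetical.

```matlab
% Hypothetical base options; names and values are illustrative only.
base_options = struct('learning_rate', 0.01, 'num_hidden', 10);
base_options.working_path = fullfile(tempdir, 'sweep_demo');

% Values to sweep over.
learning_rates = [0.001 0.01 0.1];
num_hidden_values = [10 50];

configurations = {};
for lr = learning_rates
    for nh = num_hidden_values
        options = base_options;
        options.learning_rate = lr;
        options.num_hidden = nh;
        % Give each configuration its own name and output directory.
        options.name = sprintf('lr_%g_nh_%d', lr, nh);
        options.working_path = fullfile(base_options.working_path, options.name);
        configurations{end+1} = options; %#ok<AGROW>
    end
end

fprintf('Generated %d configurations\n', numel(configurations));
```

Each entry of `configurations` could then be handed to the experiment code in turn, with its results landing in a separate directory.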
- Make it easy to execute only portions of the code.
run_experiment('dataset_1_options', '|preprocess_data|initialise_model|train_model|');
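How `run_experiment` interprets the stage string is not shown here. A minimal sketch of one way to do it, assuming the stages are separated by `|` as in the call above, is to split the string and run only the stages that were requested. The `evaluate_model` stage is a hypothetical extra, added to show a stage being skipped.

```matlab
% Stage string in the same format as the call above.
stages_to_run = '|preprocess_data|initialise_model|train_model|';

% All stages of the hypothetical pipeline, in execution order.
% 'evaluate_model' is an assumption, included to show a skipped stage.
all_stages = {'preprocess_data', 'initialise_model', 'train_model', 'evaluate_model'};

% Split on '|' and drop the empty pieces left by the leading and trailing bars.
requested = strsplit(stages_to_run, '|');
requested = requested(~cellfun(@isempty, requested));

executed = {};
for k = 1:numel(all_stages)
    stage = all_stages{k};
    if ismember(stage, requested)
        fprintf('Running stage: %s\n', stage);
        % The real work for this stage would be called here.
        executed{end+1} = stage; %#ok<AGROW>
    else
        fprintf('Skipping stage: %s\n', stage);
    end
end
```

Iterating over `all_stages` rather than `requested` keeps the execution order fixed, whatever order the caller lists the stages in.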
- Use checkpointing.
% set the options
options = ...
% load the data
data = ...
if saved_state_exists(options)
    % load from disk
    [parameters, state] = deserialize_latest_params_state(options.working_path);
    % command line output (num2str is needed to print the number as text)
    disp(['Starting from iteration ' num2str(state.iteration)]);
else
    % initialize
    parameters = init_parameters();
    state = init_state();
end
% learn the parameters
parameters = train_model(options, data, parameters, state);
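The code above covers resuming. The saving side, which would live inside `train_model`, is not shown. As a self-contained sketch of both halves together, the toy loop below writes a checkpoint every few iterations and picks up from the last one when rerun. The `checkpoint_every` option, the `weight` parameter, the file name, and the update step are all assumptions made for illustration.

```matlab
% Hypothetical settings; names and values are illustrative only.
options = struct('num_iterations', 20, 'checkpoint_every', 5);
options.working_path = fullfile(tempdir, 'checkpoint_demo');
if ~exist(options.working_path, 'dir')
    mkdir(options.working_path);
end
checkpoint_file = fullfile(options.working_path, 'checkpoint.mat');

% Resume if a checkpoint exists, otherwise start fresh.
if exist(checkpoint_file, 'file')
    saved = load(checkpoint_file);
    parameters = saved.parameters;
    state = saved.state;
    disp(['Starting from iteration ' num2str(state.iteration)]);
else
    parameters = struct('weight', 0);
    state = struct('iteration', 0);
end

% Toy training loop: the update below stands in for a real learning step.
for iteration = state.iteration + 1 : options.num_iterations
    parameters.weight = parameters.weight + 1;
    state.iteration = iteration;
    % Save parameters and state together so they always match.
    if mod(iteration, options.checkpoint_every) == 0
        save(checkpoint_file, 'parameters', 'state');
    end
end
```

Saving the parameters and the state in the same file avoids resuming from parameters that belong to a different iteration than the recorded one.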
- Write demos and tests.
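A test does not need a framework to be useful. As a small sketch, the snippet below checks two properties of a hypothetical `centre` function using MATLAB's built-in `assert`. The function and the input are made up for the example.

```matlab
% Hypothetical function under test: shifts data to zero mean.
centre = @(x) x - mean(x);

x = [1 2 3 4];
y = centre(x);

% Each assert stops with the given message if its condition is false.
assert(abs(mean(y)) < 1e-12, 'Centred data should have zero mean');
assert(isequal(size(y), size(x)), 'Centring must not change the shape');
disp('All tests passed');
```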