Tuesday, November 20, 2012

Tips to use

These are a few practices that matter when running ML experiments. I sometimes forget them too. :P
  • Use version control.
  • Separate code from data.
  • Separate input data, working data and output data.
  • Modify input data with care.
  • Save everything to disk frequently.
  • Separate options from parameters.
  • Do not use global variables.
  • Record the options used to generate each run of the algorithm.
% store the results
serialise(options, 'options.dat', options.working_path);
serialise(parameters, 'parameters.dat', options.working_path);
% or keep the results of each options configuration under its own path
serialise(parameters, 'parameters.dat', ...
              [options.working_path '_' options_configuration.name]);
  • Make it easy to sweep options.
  • Make it easy to execute only portions of the code.
run_experiment('dataset_1_options', '|preprocess_data|initialise_model|train_model|'); 
  • Use checkpointing.
% set the options
options = ...

% load the data
data = ...

if saved_state_exists(options)

    % load from disk
    [parameters, state] = deserialize_latest_params_state(options.working_path);

    % command line output (num2str is needed to print the numeric iteration)
    disp(['Starting from iteration ' num2str(state.iteration)]);

else

    % initialize
    parameters = init_parameters();
    state = init_state();

end

% learn the parameters
parameters = train_model(options, data, parameters, state);
  • Write demos and tests.
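For the tip on executing only portions of the code, here is a rough sketch of how a function such as run_experiment might interpret a stage string like the one above. The stage table, its placeholder bodies, and the variable names (stages, stage_spec, requested, executed) are my own assumptions for illustration, not the original implementation.

```matlab
% Hypothetical stage table: each stage name maps to a function handle.
% The bodies are placeholders; real stages would do the actual work.
stages = struct( ...
    'preprocess_data',  @() disp('preprocessing data'), ...
    'initialise_model', @() disp('initialising model'), ...
    'train_model',      @() disp('training model'));

% Stages to run, '|'-delimited as in the run_experiment call.
stage_spec = '|preprocess_data|train_model|';

% Split on '|' and drop the empty pieces left by the outer delimiters.
requested = strsplit(stage_spec, '|');
requested = requested(~cellfun(@isempty, requested));

executed = {};
for i = 1:numel(requested)
    name = requested{i};
    if ~isfield(stages, name)
        error('Unknown stage: %s', name);
    end
    stage_fn = stages.(name);   % look up the handle by name
    stage_fn();                 % run only this stage
    executed{end+1} = name;     % keep a record of what ran
end
```

With this stage string only preprocess_data and train_model run, and initialise_model is skipped. Failing loudly on an unknown name means a typo in the string cannot silently skip a stage.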
From hunch.net, which in turn links to the original source.