History.txt

Path: History.txt
Last Update: Fri May 30 18:23:58 -0700 2008

0.9.3 2008-05-22

Skynet::Manager and Skynet Script Runner

 - Rewrote how Skynet workers and skynet manager talks on each machine.  See below for more info

 - Added an examples/ directory with sample skynet apps.

 - Support starting Skynet with 'skynet start/stop' to daemonize
 - skynet_install now only installs a config/skynet_config.rb in your applications directory.  You are no longer supposed
 to have a script/skynet to start skynet.  Instead, as long as you have a config/skynet_config.rb you can just run
 'skynet start/stop' from within your application_directory/
 If you're installing in a rails application, it will install a default config/initializers/skynet.rb which merely requires config/skynet_config.rb
 - The config file can also be specified with --config= skynet start

 - Gracefully handles trying to start skynet more than once
 - Close file handles on exec.
     Skynet::Worker and Skynet::Manager now call Skynet.fork_and_exec instead of their own versions.
     Skynet.fork_and_exec prevents file descriptor exhaustion by calling Skynet.close_file_handles.
     Skynet::Manager detatches from console by calling Skynet.close_console

 - Dramatically improved Skynet shutdown time
 - Huge performance improvements in taking/putting tasks and results.  Much lower resource utilization, especially on mysql.
 - You now have to specify your PIDFILE and LOGFILE dirs and files differently. This will break all old skynet runners. Sorry.
 - Skynet runner handles default config variables better
 - Fix serious bug in Skynet::Partitioners::RecombineAndSplit where it wouldn't handle empty results well.
 - Fixed Skynet::Partitioners::ArrayDataSplitByFirstEntry to handle strings as keys better
 - Added Skynet::Job.results_by_job_id to retrieve results from asyncronous jobs
 - Added printlog logging method which always prints to the log as [LOG]
 - Deprecated Skynet.new to Skynet.start
 - Mysql Message Queue Adapter - Make delete_expired_messages much safer.
 - rename ActiveRecord::Base.distributed_find.each to ActiveRecord::Base.distributed_find.map
 - ActiveRecord::Base.distributed_find - Patch submitted by Lourens Naude (lourens@methodmissing.com) which checks the model for the primary_key name as opposed to assuming it is
  'id'
 - We don't want to use rails constantize so I've temporarily borrowed the method from ActiveSupport inflector and added it to skynet_ruby_extensions.
 - Fix bug in Job comment where it referenced MapreduceTest instead of Skynet::MapreduceTest
 - Fix tests. For some reason you still can't run ALL the test at once with rake test, but if the files are run individually they all pass.
 - Change mysql text fields to longtext in migration and schema files
 - Include some extras including our init.d script, our nagios monitoring script, rails controller and view for monitoring
 - Created a new skynet rails initializer which gets installed from skynet_install --rails
 - Modified the skynet_install skynet runner to take skynet initializer into account
 - Skynet::ActiveRecordExtensions Fixed a bug where it would fail if your table had fewer than 1000 rows.  It performs a count
  first now to make sure there are enough rows.
 - Introduced some new Skynet::Config methods for getting logfile and pidfile locations

Skynet Manager/Worker Refactor

  The workers used to publish their worker statuses to the skynet_worker_queue which lived in the same Q space as the skynet_message_queue.  skynet managers would then query that queue for their workers' statuses.   This was a very inefficient use of central resources.    NOW, workers communicate with their manager vir DRb, calling manager.worker_notify(status).   This adds the worker status hash onto a local Queue object stored in the manager.  A separate thread watches that queue and updates the internal manager information about it's workers.  This was in place of polling a queue every N seconds for new worker records.

Managers now save their worker information to a file periodically so they can be reloaded on a restart. This means managers can keep track of how many tasks all of their workers have done even after restarts. As a consequence of decentralizing worker stats, you now have to ask all your managers for their individual stats along with the main message queue stats. Added a stats_for_hosts method to Skynet::Manager which aggregates stats accross many managers.

0.9.2 2008-01-22

Highlights:

 - Multiple Message Queues
 - Many more Job options including options to control how jobs are distributed.
 - The various options for how a job is run has been made much clearer.
 - The Mysql Message Queue Adapter has been optimized and made more reliable.
 - You can now control how many times skynet retries failed master, map and reduce tasks.
 - Large data sets can now be streamed to the queue.

Details:

Active::Record#distributed_find

  - Active Record distributed_find now handles REALLY large sets by breaking them into seperate jobs. 1MM models per master broken into ranges of 1000.

Skynet::Job

  - code path through Skynet::Job is now clear. There are 3 ways to run Skynet::Job. Local Master (default), Remote Master, Async (implies remote master)
  - Skynet::Job supports keep_map_tasks and keep_reduce_tasks settings.
    If true, the master will run the tasks locally.
    If a number is provided, the master will run the tasks locally if there are LESS THAN OR EQUAL TO the number provided
    There are also Skynet::CONFIG settings for defaults. DEFAULT_KEEP_REDUCE_TASKS, DEFAULT_KEEP_MAP_TASKS
    I can see there being a problem with the timeouts being a little off... Since your kinda in a master and kinda in a map or reduce timeout.  The task timeouts will be correct at least.  Though, you won't get the benefit of redoes yet.
  - Skynet::Job now supports setting RETRY times per job by MASTER, MAP and REDUCE. So you can have a MASTER_RETRY=0, but have MAP_RETRY=2 and REDUCE_RETRY=3.  There are now defaults for those as well :DEFAULT_MASTER_RETRY, :DEFAULT_MAP_RETRY, :DEFAULT_REDUCE_RETRY.    If a message passes its RETRY it will be marked with an iteration of -1.   delete_expired_messages removes those messages as well.  These show up in the stats as :failed_tasks.  Skynet::Task and Skynet::Message now have retry fields denoting the maximum number of retries.
  - You can now pass queue_id or queue to Skynet::Job
  - There is now a :MAX_RETRIES config setting that controls how many iterations Skynet will even look for tasks as well.
  - You can now stream map_data to the queue by passing an Enumerable for your map_data
  - Refactored Skynet::Job to be much cleaner and easier to test.
  - Skynet::Job now has access to a local queue which it can treat almost like the real one.
  - deprecate Skynet::Job#run_master
  - rename reduce_partitioner method to just reduce_partition
  - Skynet::Job has better support for running tasks locally.  A job may run tasks in its own process if
    you are running in solo mode, you've made a "single" job, or you set the keep_map_tasks or keep_reduce_tasks below.
    When a job runs tasks locally it now honors the retry settings and timeouts.
  - Skynet::Job and Skynet::AsyncJob are now almost identical.  In fact you can just use Skynet::Job and tell it to run async.
  - Skynet::Job Changed map_tasks and reduce_tasks to mappers and reducers respectively.  This was to remove the ambiguity between the actual map/reduce tasks and the number of mappers/reducers desired.
  - Skynet::Jobs can not be told what queue to use for that job by passing :queue or :queue_id DEFAULT 0
  - Skynet::Job won't call the reduce_partitioner if there are no valid results from the map_step.

MapreduceHelper mixin

  - You can include MapreduceHelper into your class and then implement self.map_each and self.reduce_each methods.  The included self.map and self.reduce methods will handle iterating over the map_data and reduce_data, passing each element to your map_each and reduce_each methods respectively.  They will also handle error handling within that loop to make sure even if a single map or reduce fails, processing will continue.  If you do not want processing to continue if a map fails, do not use the MapreduceHelper mixin.

Multiple Message Queues!

  - Add the ability to have multiple message queues in the same table message_queue_table.
  - You can start skynet with a --queue_id or --queue option to determine which queue workers should look in.
  - Skynet::Jobs can not be told what queue to use for that job by passing :queue or :queue_id.  DEFAULT 0
  - Queues can be configred via Skynet::CONFIG[:MESSAGE_QUEUES] = [] which comes with an array of queues id 1 through 10 named "one" through "ten"

Skynet Console

  - You now start the skynet console by running 'skynet console' at the command line.   There is no longer a skynet_console app.
  - The console now loads the configs that are in your local skynet script.

Skynet::Config

  - Added Skynet.silent {} Runs your code with no debugging output.
  - Made sure all config options for TupleSpace? adapter begin with TS and all Mysql adapter CONFIG settings start with MYSQL.
  - There is now a :MAX_RETRIES config setting that controls how many iterations Skynet will even look for tasks as well.

Skynet::Partitioners

  - Created a Skynet::Partitioners class where various partitioners can be found.  To specify one of them merely provide that specific Skynet::Partitioners subclass as your reduce_partitioner in your Skynet::Job

Skynet Install

 - skynet_install now has a --mysql option.  This installs the migration as well as a skynet_schema.sql file.

Mysql Message Queue Adapter

  - You can now configure your database options outside of rails with Skynet::CONFIG[:MYSQL_*] options.
  - Mysql adapter now updates the updated_on time of rows in skynet_message_queues
  - Fixed Mysql Message Adaptor to take_next_task safer and more efficiently.   There seems to be far less risk of a race condition where two workers would take the same task.
  - Eliminated 1 db update per every task taken making it MUCH more efficient.
  - Skynet::MessageQueueAdaptor::Mysql now tries to reconnect if it gets disconnected.   This was to solve the "Mysql Server has Gone Away" errors.
  - Implemented version_active? in mysql message queue adapter.   It's a way for workers to check to see if a version is still in the queue.

Skynet::Task

  - Skynet::Task#master_task takes care of creating the master job and task now. I have mixed feelings about this.
  - ENFORCED TIMEOUTS - Even though a master might give up on a worker if it didn't respond in time, there was nothing to step any given worker from running forever.  We now enforce the timeouts (master_timeout, map_timeout, reduce_timeout) given in Skynet::Job using the Timeout module.  This causes a Timeout::Error to be thrown.   If you are using the mysql adapter, this can cause strange results sometimes.   If the Timeout error is thrown during a DB query, ActiveRecord will throw an ActiveRecord::StatementInvalid exception which includes the Timeout::Error exception in it.  Not sure how to prevent that from happening.

Skynet::Worker

  - Workers now have a Skynet::CONFIG[:WORKER_MAX_PROCESSED] setting to control when to respawn based on how many that worker has processed.
  - You can start skynet with a --queue_id or --queue option to determine which queue workers should look in.
  - Workers do not restart until there are no more items in the queue of that version.

Skynet::Message

  - Skynet::Message now stores fields as an array.  It is far more efficient now as well.

Now 90% more Tests!

BUGFIXES

  • Skynet Workers now restart properly when the worker_version changes.
  • starting tuplespce_server, you no longer need to provide the —port if you‘re already providing the drburi-
  • Fixed a bug where Skynet::MessageQueueAdapter::Mysql would sometimes pick up tasks another worker had already picked up.
  • Fix bug in Skynet::Worker where it wouldn‘t die right if the max processed was reached.
  • Workers were supposed to restart when the worker_version changed. They do that properly now.

Thanks to Jason Rimmer for finding these bugs.

  • Fix bug in Skynet::Message where it would calculate the iteration improperly.
  • The skynet gem appears to be missing the Rubigen dependency
  • Running skynet_install even without the rails arg still generates code with rails dependencies and tailings: RAILS_ROOT, RAILS_ENV, and the various directory tailings such as ‘db/migrate’, etc.
  • Generated skynet script is missing "require ‘rubygems’"
  • Specification of pid directory and file is incorrect as ‘skynet_manager.rb’ wants only a directory with it specifying the file
  • The sleep while waiting to start the queue server isn‘t long enough. There is now a CONFIG setting TS_SERVER_START_DELAY.

0.0.1 2007-12-16

  • 1 major enhancement:
    • Initial release

[Validate]