
Testing private Ruby methods

First of all, testing private methods is a debatable practice; its discussion deserves a separate post. For now I will just say that we do it.

Several months ago a colleague of mine found an elegant way to test private Ruby instance methods, published by Jay Fields: http://blog.jayfields.com/2007/11/ruby-testing-private-methods.html

require 'rubygems'
require 'dust'
require 'test/unit'

class Ninja
  private

  def kill(num_victims)
    "#{num_victims} victims are no longer with us."
  end
end

class Class
  # Temporarily make all private instance methods public for the duration
  # of the block, then restore their original visibility.
  def publicize_methods
    saved_private_instance_methods = self.private_instance_methods
    self.class_eval { public *saved_private_instance_methods }
    yield
    self.class_eval { private *saved_private_instance_methods }
  end
end

unit_tests do
  test "kill returns a murder string" do
    Ninja.publicize_methods do
      assert_equal '3 victims are no longer with us.', Ninja.new.kill(3)
    end
  end
end


This approach lets tests call private methods in a convenient and compact way, just as regular code does. It is an extra advantage for Ruby 1.9, where relying on #send for this purpose was expected to stop working. The only disadvantage is that it handles only instance methods, not singleton (class) methods.

So here is my modification, which adds support for singleton methods:

class Class
  def publicize_methods
    saved_private_instance_methods = self.private_instance_methods(false)
    # A class variable is used because local variables are not visible
    # inside the class << self scope below.
    @@saved_private_class_methods = self.private_methods(false)

    # Make private instance methods and private singleton (class) methods public.
    self.class_eval { public *saved_private_instance_methods }
    self.class_eval do
      class << self
        public *@@saved_private_class_methods
      end
    end

    yield

    # Restore the original visibility.
    self.class_eval do
      class << self
        private *@@saved_private_class_methods
      end
    end
    self.class_eval { private *saved_private_instance_methods }
  end
end
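
For completeness, here is a small usage sketch of the extended version; the private class method below is invented purely for illustration:

class Ninja
  class << self
    private

    # Hypothetical private class method, added only for this example.
    def army_size
      42
    end
  end
end

unit_tests do
  test "private class method is callable inside publicize_methods" do
    Ninja.publicize_methods do
      assert_equal 42, Ninja.army_size
    end
  end
end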

One of those silly problems that take days

Here is an example of a simple, silly problem that took days to solve.

Two days ago I finally found time to set up an internal wiki for the company I work for now. I added a first version of the structure, published the first few articles and sent an email to coworkers inviting them to look at it and use it. After that I started getting feedback from different people that they could not reach the wiki site and were getting some strange HTTP authentication prompt. I should say that the site hosting the wiki is located on one of our quietest and most stable servers. There is no testing, no staging, no big load problems: a regular Amazon instance with Ubuntu, just a few third-party applications used as-is in stable versions, providing a development environment such as an issue tracking server, a continuous integration server and so on.

Well, there is just one application requiring HTTP authentication on this server, but coworkers were meeting such prompts from several different applications hosted on different Nginx virtual hosts. Even worse, the problem seemed to appear and disappear on random computers and at random times. I, for example, did not experience the problem at all.

Our sysadmin noticed that the HTTP authentication was being requested by Tomcat, which surprised me even more, because I did not remember any application using Tomcat on this server and did not see it among the processes. After an additional check it turned out that one of the applications IS in fact running on a modified version of Tomcat under a different name. But that still did not explain how Nginx could hand requests for other virtual hosts over to it. Result of the day: the sysadmin reinstalled Nginx and the problem seemed to be solved.

But the next day the same problem was back, and nobody had time to work on it.

Today I hit the problem myself, connecting from home. I tried excluding the suspicious application from Nginx (Nginx is used here only as a reverse proxy) and requesting it directly on a different port. Here I met another surprise: I could not reach the application even after changing the security rules for that port. Even more interesting, after the exclusion, port 80 on that virtual host served some other application definitely not used by the company. Its name, something like VisualClick, made me think about security issues and hacking techniques. But I was experimenting on a Saturday, and Saturday in Israel is not the best day to disturb the sysadmin without a really good reason.

Finally I started checking IP addresses and discovered that requests to different virtual hosts were going to different IPs. A check of the DNS records showed that every host on this server had three records with different IP addresses. There was the cause! Probably for some time already, requests had been going randomly to different IPs until they got a response from the right one. Why did we start noticing it only now? It seems some other server on the Internet got one of these IPs, and that server is the one hosting a Tomcat-based application that sends HTTP authentication requests. I removed the invalid IP addresses from DNS and that solved the problem.

Another question is why there were invalid DNS records at all. And that reminded me of one more application on this server: it hosts Squid, and the server's Elastic IP is periodically changed through the AWS API. The script that changes the IP also updates DNS with the new IP through another API. Probably there is a bug in that script that makes its work with DNS inaccurate.
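
If that guess is right, the fix is to replace the host's A records rather than append to them. Here is a minimal sketch of the idea; dns_records_for, delete_record and create_record are hypothetical helpers standing in for whatever DNS API the real script uses:

# Sketch only: point a host at the new Elastic IP and drop stale A records.
def point_host_to(host, new_ip)
  a_records = dns_records_for(host).select { |r| r.type == 'A' }

  # Remove every A record that doesn't match the current Elastic IP.
  a_records.each do |record|
    delete_record(record) unless record.value == new_ip
  end

  # Add the new record only if it isn't there already.
  create_record(host, 'A', new_ip) unless a_records.any? { |r| r.value == new_ip }
end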

So, the result of the day: one problem solved, another one discovered. But spending four days (even partly) on such a thing is too much!

Some conclusions for the birthday

Today I got a notification from LiveJournal about my own upcoming birthday, and it reminded me of this blog, which I started two years ago and never continued. So my birthday is a good opportunity to try again, starting with a summary of the changes of these two years.

So what has happened over these two years? The main changes:

Switching from PHP to Ruby as my everyday language
Besides the simple fact that it is the choice of the company I work for now :), Ruby turns out to be a very nice, expressive language, very convenient for rapid object-oriented programming (significantly more so than PHP). There are two features I would like to mention especially: mixins and closures, which give code a great deal of flexibility (a small illustration below). And its best-known framework, Ruby on Rails, is convenient and rich, with a large community and many ready-made third-party libraries.
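
Here is a tiny illustration of both features; the module and class are invented just for this example:

# A mixin: any class that includes Auditable gets this behaviour for free.
module Auditable
  def with_audit(action)
    result = yield                 # run the closure passed by the caller
    puts "#{self.class.name}: #{action} -> #{result.inspect}"
    result
  end
end

class Account
  include Auditable

  def initialize(balance)
    @balance = balance
  end

  def deposit(amount)
    # The block is a closure: it captures both amount and @balance.
    with_audit("deposit of #{amount}") { @balance += amount }
  end
end

Account.new(100).deposit(25)   # prints "Account: deposit of 25 -> 125"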

The minuses: for the first time since switching to Ruby I have to think about CPU and memory resources on the machine I work on. Performance is also a well-known issue, and it leads to quite different techniques for building web applications. Even the simplest web application is built as a main application plus a background worker dedicated to everything that does not have to be done immediately (see the sketch below). I cannot say that you never meet this need in other languages (the closest example for me is Drupal with its hook_cron, which moves everything that can wait into cron handling); it is just that with Ruby you meet the need for such a split significantly sooner.
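
A minimal sketch of that split, using Resque here as one example of a queue-backed worker (the job class and queue name are made up):

require 'resque'

# Anything that can wait is pushed onto a queue and handled by a separate
# worker process (started with rake resque:work QUEUE=reports) instead of
# being done inside the web request.
class ReportJob
  @queue = :reports

  def self.perform(report_id)
    # ...build the report, send the email, and so on
    puts "building report #{report_id}"
  end
end

# In the web request we only enqueue the job and return immediately:
Resque.enqueue(ReportJob, 42)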

I would summarise my experience of the last year this way: if you need to build a web application quickly (for a CMS, look at Drupal/WordPress), try to start with Ruby on Rails and, if needed, split out the parts written in other languages. Such parts can interact with the main application as web services, through a message queue, or through shared resources and periodic tasks, for example.
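
As an example of the web-service option, the Ruby side can talk to a component written in another language over plain HTTP; the host and endpoint below are made up:

require 'net/http'
require 'uri'
require 'json'

# Hypothetical recommendation service implemented in Java and exposed as a
# small web service on the internal network.
uri = URI.parse('http://recommender.internal:8080/recommendations?user_id=42')
response = Net::HTTP.get_response(uri)

if response.is_a?(Net::HTTPSuccess)
  recommendations = JSON.parse(response.body)
  puts recommendations.inspect
end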

There were also some small projects in Java, but they can be considered only a start.

Some migration away from web programming
The project I work on now can be called a web project, but its web part (REST/CRUD, client-side work, web-specific session, security and load/performance questions) takes, I would say, 20% of the project. The system is now divided into 9 independent projects (mostly in Ruby, but Java is also in use) and uses 13 types of servers in different roles (several dozen real and virtual servers). For me it is a school of scalable architecture and distributed computation, and a small step in the direction of artificial intelligence. I hope to have the opportunity to work more in this direction.

Some technologies I want to mention here:
Scalability:
  • Parallelization of work with message queue
  • Background processing with scalable daemons
  • Solr search engine server
  • Cloud techniques (Amazon services): EC2 (hosting), S3, Elastic IP, MapReduce
  • Event Stream Processing

Artificial Intelligence:
  • Rules Engines
  • Event Stream Processing and Complex Events Processing
Open-source libraries
After several earlier attempts at open source (mainly for Drupal) that were started but never finished, there are now two open-source gems:

https://rubygems.org/gems/sunspot_activerecord
and
https://rubygems.org/gems/de

Not too bad, but it could be much better. So I have plenty to do :)

Drupal deployment overview - continued


Installation profile

A core Drupal mechanism that allows a customized set of modules, their settings and the creation of initial objects to be included as part of the Drupal installation. The developer writes a profile script implementing several helper functions. It is a good solution for the initial installation of a white label, but it provides no means for content merging, so it does not solve the main problem.

 

Profile Generator module

http://drupal.org/project/profile_generator

A contributed module that automates installation profile generation based on the current site. The process is configurable: there is a UI for choosing which object types are included in the profile. Currently node types, roles, users, menus, blocks and URL aliases can be included. It reduces the amount of manual work, but the functionality is not rich enough and no extension points are provided. It exists only for Drupal 5 and is version dependent. The main problem is the same: it does not support merging database content after the initial deployment.

 

Install Profile API and Profile Wizard

http://drupal.org/project/install_profile_api

One more contributed module automating install profile creation. It provides a set of helper functions (CRUD-style wrappers) that make it easier to work with Drupal data structures. Profile Wizard helps generate .profile files built around the crud.inc file.

It allows including almost the same types of objects (plus specific nodes) in the profile. Beta versions exist for Drupal 5 and Drupal 6. The problems are the same.

 

Update scripts

A core mechanism that allows functionality to be updated, including the structural changes needed in the database. Update scripts can be run on the production server with no downtime at all and no content transfer problem; updating functionality is this tool's main purpose. The time needed to write a specific update script is usually small, but the script is not generic. Ideally, no manual work is needed after the script runs. However, an update script is semantically tied to the concept of a Drupal module: its main purpose is to bring the database and file system into a state that supports the module's new version, and not every structure/configuration change that must be reflected in the database can be considered module-related. One can write a special module containing an update script for every deployment, but such an approach is artificial and hard to call systematic or well organized. There should be a way to record the various configuration changes and object creations/deletions made during development and bug fixing. There is no system in the set of update scripts, and such a solution is very sensitive to the current database state: usually an update script cannot be run twice on the same server because it changes the database state. That makes writing and testing a correct update script for a given server rather complicated. Since an update script is code, how API-oriented the solution is, and how sensitive it is to the Drupal version, DBMS and OS, depends on the developer's skills and priorities; the solution itself provides no guarantees.

 

Journal

http://drupal.org/project/journal

This contributed module allows developers and system administrators to record and track all changes performed to set up a site or alter its configuration. Journal also lets developers maintain a log of applied patches and customizations on a Drupal site. Versions exist for Drupal 5 and 6. The log is written manually: the module adds an extra field for tracking messages to selected forms (mandatory on settings forms), and the messages are written to the log when the form is submitted. It is another solution that works against the production server's database, so no content transfer is needed. Its purpose is to make it easier for the developer to repeat, on the production database, the changes performed during development and bug fixing on the development database. Downtime depends on the number and significance of the changes, and all the real changes are manual. For all forms except module settings forms the log field is optional, so using the solution requires some discipline: the developer is responsible for commenting every essential database change. Another option is to use the Journal module in conjunction with other solutions, for example together with update scripts.

 

Patterns

http://drupal.org/project/patterns

This contributed module shares the installation profile's idea of building the site structure in one installation action, but addresses some of the installation profile's limitations, such as:

·         It's a one-time initial setup and can't change anything on the site after the initial configuration.

·         You can't create separate, smaller groups of features and configurations that can be combined, mixed and matched individually or as a group.

·         It can be hard to read and work with, since it is PHP code, which creates a disconnect with site planners/architects.

·         It's not possible to set up all the desired settings/configurations (i.e., API and additional module support).

·         It's hard to manage changes to the profile or setup as requirements and best practices change (which they constantly do).

·         You can't easily have slightly differing configurations (you would need a separate install profile for each).

 

It is a module that supports definitions of requirements (patterns of any type) and runs/enables them on a new or existing site. The module is very interesting. Since the site functionality is divided into distinct semantic parts, they can be run on the production server to change its structure without losing data. Patterns are defined by XML files, and a UI is provided for choosing and running individual patterns. The system is component-based: adding a component is enough to support additional functionality, and pattern removal is supported. Naturally there are also downsides to using this module for deployment. The most significant one stems from semantic incompatibility: patterns are originally conceived as building blocks of the site structure/configuration. Trying to use them for version-to-version structure changes or for merging branch changes leads to rather artificial patterns that remove or change very specific objects; such patterns are meaningless as site building blocks and too tightly coupled to the current database state. A similar situation was already discussed for update scripts, and it is a common problem for all solutions that propose changing the structure of a production site. Another minus is the inconsistently API-oriented implementation of some components: there is too much work directly against Drupal's database abstraction layer, which makes the current implementation more sensitive to Drupal version changes. For now the module is available as an alpha version for Drupal 5 and a development snapshot for Drupal 6. It currently supports modules, module settings, CCK content types, fields, field weights/order, views (type, fields), panels, blocks, permissions, content/nodes (not finished yet), users, profiles, taxonomies and Imagecache.

 

Deployment

http://drupal.org/project/deploy

A package of contributed modules forming a deployment framework. The framework includes the Deployment API, Deployment Implementers and Deployment Services:

·         Deployment API - implements the concept of a deployment plan.

·         Deployment Implementers - individual modules implement the deployment API to add the data they need to a deployment plan and expose that ability to the front end.

·         Deployment Services - Services modules that contain the knowledge to receive deployed data and do what is appropriate on the destination server.

The solution works on both development and production servers. A custom deployment plan is generated on the source side and site changes are pushed via XML-RPC to the destination site. It is a purpose-built deployment solution: it is API-oriented and has a UI. It allows deploying items to one live destination site from different source sites (a possible way to merge changes from different branches). It supports the concept of a plan, including a manageable order of operations, and every deployment request executes a separate deployment plan. For now only a beta version for Drupal 5 is available and too little functionality is supported: the modules allow copying only content types, system settings and views. So the framework's author has in mind moving new functionality, defined by objects, from the development server to the production server, and that causes the problems described above. The framework architecture actually allows inverted usage, but for now no suitable services and implementers exist.

There are two problems with such a solution:

1.    The main problem is that the changes are situational and not repeatable. There is no support for a pure site structure. When moving changes to the production site, a valid structure is not guaranteed by the solution; when moving changes to the development server, additional plan items have to be written to clear the database of unsuitable test data.

2.    There is no mechanism for interaction between functionality handlers; for example, the mechanism moving nodes should be aware of changes in the identifiers of the taxonomies it is connected to.

 

Database merge

Another approach to the problem is to try to merge the changes in the production database with the changes in the development database at the SQL level. Examples: http://drupal.org/project/dbscripts, http://www.geocities.com/mergedb/. The main problem is that every such script is overly sensitive to all kinds of possible changes, since it is not API-oriented: it is sensitive to Drupal version changes, to version changes of individual modules and to the DBMS used. Such an approach is also hardly extendable: adding any new functionality that can interact with already handled functionality requires revising part of what was handled before.

 

Conditions allowing database merge in DBMS terms

A modification of the previous approach is to create conditions under which a plain SQL database merge is less problematic. The main problem of a database merge is collisions of database identifiers. For example, the ID of a node that is considered part of the site structure in the development version can coincide with the ID of another node from the production version that is valuable as content. So finding a way to avoid ID collisions permits a relatively simple, plain database merge in DBMS terms. Examples of this approach:

·         Reserving the first 1000 (for example) IDs for structure-related objects (http://www.dave-cohen.com/node/1066). For tables that use the sequences table it is enough to set the corresponding sequence IDs on the production database to 1000 wherever they are less than 1000, and to change the db_next_id() implementation to handle the case where a sequence name is not in the table yet. For tables that rely on auto-increment, the changes have to be made at the auto-increment step level.

·         Configuring the production and development databases with different auto-increment offsets and using replication to merge the changes (http://drupal.org/node/181128).

Both approaches need additional handling for tables that do not use numeric primary keys. Apart from the obvious note that such a solution is 'hack'-type, its main problem is supporting legacy systems that were not built this way from the start.

 

DAST

http://drupal.org/project/DAST

The main problem of deploying a Drupal-based system is merging content data changes from production with structure/configuration changes from development. But every deployment also includes common steps that can be automated, such as copying and deleting files, creating version control tags, dropping and creating the database and its user, and so on. The DAST project addresses this part of the deployment problem. It works with the Phing framework (http://www.phing.info/trac/). DAST is a PHP CLI application for *nix/Windows: it does not run inside Drupal and requires shell access and a PHP 5.2.x or newer CLI interpreter. For now it is available for Drupal 5.

 

Solution

None of the solutions described above can be considered satisfactory, so there is a need to build another one. Regarding the merging of database changes, it should support the concept of a pure system structure/configuration, since building such a structure is a more stable and repeatable process. This structure should be divided into semantically distinct parts to ease branch merges. After the right structure is built, the tool should fill it with content from the production site. The tool should be API-oriented, which minimizes its sensitivity to the Drupal version and lets Drupal itself take care of the data not handled by a specific object type handler. It should support the concept of moving a set of objects of different types from one database to another. The set of types should be extendable, and support for both Drupal 5 and Drupal 6 should be provided. A mechanism for interaction between object type handlers should be provided, and the tool should be configurable for the needs of different systems.

 

The solution should also take into account other aspects of the deployment process; integration with DAST can be considered.

Drupal deployment overview


The deployment process of most systems includes several identical steps:

·         Transfer of files from the file system

·         Transfer of database(s)

·         Configuration of the system to suit the current server environment and customer needs

·         After the site goes online, deployment (connected to bug fixing or functionality changes) also includes transfer of the content data existing on the production server at deployment time.

 

For a Drupal-based system the last step is problematic: part of the system's structure, configuration and design is content from the DBMS point of view, and they cannot be cleanly separated at the SQL level.

 

Existing methods

Several deployment methods exist for Drupal. Their advantages and disadvantages are gathered in Table 1 (Existing deployment methods) and the methods are described in more detail after the table.

 

Considered comparison aspects:

·         Automation degree. The most important factor. Which part of the work does the solution automate? How much manual work has to be performed afterwards?

·         Man/hour capacity. How much development time does it take (both the automated and the manual part)?

·         Downtime. Does the solution require downtime on the production server? How significant is the downtime in comparison with other solutions?

·         API orientation. Is the solution API-oriented, or is it a "hack"-type solution? How stable is it? How sensitive is it to changes in the set of modules used? To Drupal version changes?

·         Semantic consistency. Is deployment the real purpose of the solution, or can it merely also be used for deployment? Is it a workaround? Does it have side effects caused by its main purpose?

·         Extendibility. No solution supports all potential functionality of a Drupal-based system, and every new Drupal module can require changes in the deployment process. Is the solution extendable? Does it support any kind of plug-ins?

·         Configurability. On the other hand, every concrete Drupal-based system at every stage uses only a subset of the available Drupal functionality and has its own specifics. Is the solution configurable for the needs of a specific system?

·         Branch support. System development usually involves a version control system and development branches. For example, there may be a need to deploy bug fixes while new functionality is under development and in an unstable state in another branch or in trunk. In such cases the version control system supports merging changes made at the file-system level, but not database changes, and even less so Drupal structure changes at the database level. Can such a merge be performed with the help of the given solution?

·         Drupal version sensitivity

·         DBMS sensitivity

·         OS sensitivity

 

 


 

| Solution | Automation degree | Man/hour capacity notes | Downtime | API-oriented | Semantically consistent | Extendable / Configurable | Branch support | Drupal version / DBMS / OS sensitive |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Installation profile | All content merge is manual | A new module adds little automated work, but can cause a large amount of additional manual work | NA | + | Only for initial install of a white label | + / + | NA | - / - / - |
| Profile Generator module | All content merge is manual | A bit less than for a plain installation profile | NA | + | Only for initial install of a white label | - / + | NA | + / - / - |
| Install Profile API and Profile Wizard | All content merge is manual | A bit less than for a plain installation profile | NA | + | Only for initial install of a white label | - / + | NA | + / - / - |
| Update scripts | Ideally full automation | Time to write a given update script is small, but it is not generic | - | ? | Only in a subset of cases | + / - | + | ? |
| Journal | All database changes are manual | No code writing; many manual changes through the site interface | Depends on the changes; can be significant because all changes are manual | + | + | - / the solution itself is a type of configuration | + | + / - / - |
| Patterns | Ideally full automation | Time to write additional components for unsupported functionality; minimal time to configure the patterns themselves | - | Not enough in the current implementation | Only in a subset of cases | + / + | + | + / - / - |
| Deployment | Ideally full automation | Time to write additional services and implementers for unsupported functionality; minimal time to build the plan itself | - | + | + | + / + | + | + / - / - |
| Database merge | Ideally full automation | Extending functionality is time-consuming relative to other methods | + | - | + | Hardly / + | - | + / - / dump parts only |
| Conditions allowing database merge in DBMS terms | Ideally full automation | Minimal developer time required if an integration server is used | + | - | + | - / - | - | - / + / - |
| DAST | NA | NA | NA | NA | NA | ? / + | NA | - / - / - |

Table 1. Existing deployment methods