Symmetri Developer Blog

November 4, 2009

Dividing the perimeter of an ellipse into equal segments

Flash/Flex, Algorithms - By Shourov Bhattacharya

Suppose we want to lay out N number of objects on the perimeter of a circle, equally spaced. Finding the position of each object is trivial; we simply divide the total angle 2*PI by N and then draw out a point at the constant radius at that angle. However, doing the same thing on an ellipse is surprisingly tricky. Using the same algorithm as the circle in polar form does not yield the right results; the points end up “bunching up” on the minor axis. To find the correct algorithm, we need to think parametrically - we parametrize the elliptical curve by a variable theta and then plot points at equadistant intervals projected on the linear axis of theta:

var theta:Number = (Math.PI*2)*(i/NUMBER_OF_SEGMENTS) - Math.PI/2;
var a:Number = 100;
var b:Number = 80;
var x:Number = a* Math.sin(theta);
var y:Number = b*Math.cos(theta);
var point:Point = new Point(x, y);

August 27, 2009

How random is the .NET Random class?

General, .NET/C#, Algorithms - By Shourov Bhattacharya

The Random class in the .NET framework is used to generate “random” numbers. I use quotations around the word “random” because the output of the function Random.Next() is actually what is known as “pseudo-random”; it is the output of a deterministic mathematical algorithm that “mimics” randomness. Specifically, it is an implementation of Knuth’s subtractive random number generator, which is an algorithm that performs incremental lookups on a large table of values and outputs the difference between selected numbers. The output is close enough to random for most practical purposes encountered in software development, with the exception of cryptography (for which purpose the .NET framework also includes the System.Security. Cryptography.RandomNumberGenerator class).

Recently I was confronted with an interesting question in the line of work: just how “random” is the Random class? Is there a way to quantify the “randomness” of a so-called random number generator (RNG)? One can imagine a linear scale of RNGs ranked by “randomness” - at one end would be entirely predictable, non-random generators (such as a routine that returns a constant value); and at the other end what we might call “true” random processes that occur in nature such as atomic decay1. In between, all other RNGs that one can think of would lie somewhere along that scale. But how exactly can they be ranked and “randomness” quantified?

It turns out that there are several suggested ways of measuring randomness. They all start by taking a sample output from the RNG in question, and one that is of sufficiently large size (bit streams of about 12MB have been mentioned in the literature); and then subjecting that output file to tests. There are many different testing processes that have been proposed for the task. Below are a selected few:

1) compression - a simple test is to compress the sample output using commercial compression formats such as ZIP and then to compute the ratio of compressed to uncompressed size. This test is really only good at finding large biases in your RNG; i.e. if the compression is able to siginificantly reduce filesize, you should assume that the RNG is producing output with definitive patterns

2) a visual test of Monte-Carlo plot - by plotting the values of RNG output onto a two-dimensional array, any large scale patterns become apparent to the human eye. A great example can be found at RANDOM.ORG2

3) DIEHARD test - the DIEHARD series of tests was developed by Prof. George Marsaglia at Florida State University. It applies a number of statistical tests to the sample RNG output; the composite of their results can be used to make an overall pass/fail judgement. It is a well-regarded process that is used, for example, to certify the fairness of the operation of online gambling sites

4) test for FIPS 140-2 - FIPS 140-2 is a standard maintained by the National Institute of Science and Technology in the United States that concerns itself with the security of cryptographic providers. Section 4.7.1 outlines requirements for approved RNGs; they are based mainly around the idea of entropy, which very loosely speaking is a measure of the information content of a sequence

Exactly how far one wants to go in testing a RNG for randomness depends on the context. For commercial non-cryptographic applications, it is probably good enough to use the built in RNGs that come with software development platforms - such as the .NET Random class (a caveat is that the seeding process for the RNGs should use a source that incorporates some natural uncertainty - such as a combination of time, local machine state and user input). For stronger “randomness”, there are a number of implementations of RNGs floating around in the public domain that can be used (e.g. a popular one is the Mersenne Twister). RNGs with a very high “randomness” have to rely on external, physical sources of entropy such as atmospheric data or atomic decay, and are probably only warranted in very specialized scientific applications.

1There remains the philosophical question of what is “truly” random; for example, do the quantum mechanical laws that govern atomic decay generate “output” that is the result of some as-yet undiscovered deterministic process? The answer is probably no, but it’s a question that exercises the minds of some of the world’s best physicists and philosophers.

2The demonstration shows the alarming large scale patterns inherent in the RNG that is the PHP rand() function - alarming because most developers think of functions like that as “true” random number generators

August 6, 2009

Bulk data import from CSV to SQL Server CE

General, SQL Server, Algorithms, WPF/Silverlight - By Shourov Bhattacharya

SQL Server Compact Edition is a great little product - a free version of SQL Server that can be easily deployed on clients for mobile and desktop applications. However, it does of course lack many of the less essential features of the SQL Server product. One of the things that it does not support is a way to do a bulk import of data into the database - the BULK INSERT T-SQL command is not available.

Currently I’ve got a project in which I need to get large datasets (400,000+ rows) from a CSV file into a SSCE database in the context of a WPF application. Firstly I tried the most conceptually simple (but probably least efficient) method - I read the CSV file line by line and do an INSERT into the SSCE database using the standard ADO.NET Connection and Command objects. But is is slow. Performance on my run-of-the-mill setup here (WinXP SP2 on an average machine) is of the order of 100,000 rows being imported every hour - which means my first data file of half a million records takes almost five hours to import!

[An added complication is that I need to check for duplicates before INSERTing, and that SSCE doesn’t support the IF EXISTS(…) T-SQL construct either - meaning that I have to do an extra query for every record to check for existence of a duplicate before INSERTing.]

What to try next? It seems that there aren’t many options - I can’t find an alternative to using ADO.NET objects in this context, and there seems to be only one way of inserting data. However, there is plenty of scope for streamlining the way the INSERT is done. Watch this space …

UPDATE: Well, massive performance improvement straight away by keeping an open connection instead of creating one each time. I have now got around 3.6 million records being imported in an hour, as compared to 0.1 million before - about 36 times faster! Taking the new Connection() out of the loop was all that was required.

That is, I went from here:

for (...)
{
    // create connection for every query
    using (SqlCeConnection connection = new SqlCeConnection(connectionString))
    {
        string sql = INSERT into TABLE ….;
        SqlCeCommand cmd = new SqlCeCommand(sql, connection);
        cmd.ExecuteNonQuery();
    }
}

To here:

// create connection once
SqlCeConnection connection = new SqlCeConnection(connectionString);
	
// now re-use same connection over and over
for (...)
{
    if (connection.State != System.Data.ConnectionState.Open) connection.Open();
    string sql = INSERT into TABLE ….;
    SqlCeCommand cmd = new SqlCeCommand(sql, connection);
    cmd.ExecuteNonQuery();
}

Very basic optimization of course - programming 101. However, I was surprised by the scale of the performance improvement - I would not have expected the revised code to be one and a half orders of magnitude faster. Is creating a new ADO.NET Connection really such a costly procedure? And what does this mean for the using keyword? The disadvantage of removing the using{...} block now is that I have to add my own code to clean up and close connections in the case of errors to reduce the risk of memory leaks.

I was going to explore more optimization, but this performance is now good enough in the context of this project. A bulk data import facility for SSCE would be nice though. Given that WPF applications run on the desktop a lot of the time, large datasets are going to be feasible to work with, given that there is no need to pull anything over a network for display in a UI grid etc. In that context, the ability to make bulk changes to data is not an unreasonable thing to want. On the other hand, it looks like basic ADO.NET will do the job okay - at least for me!

January 7, 2009

Search ArrayCollection for a property on an Object

Flash/Flex, Algorithms - By Shourov Bhattacharya

A common scenario is to have an ArrayCollection of Objects and to want to find a particular Object using some sort of criteria. Now Actionscript does provide the ability to find an element within an ArrayCollection using the getItemIndex(element) method, but for that you need to provide the exact element you are looking for as an argument. There is no good way to find an element given a match on a property - kind of like a search rather than a lookup. An implementation of that as a utility function is very simple*, but I can’t think of a better way to do it:

// search an ArrayCollection for a property on an object
static public function getItemIndexByProperty(array:ArrayCollection, property:String, value:String):Number
{
   for (var i:Number = 0; i < array.length; i++)
   {
      var obj:Object = Object(array[i])
      if (obj[property] == value)
      return i;
   }
   return -1;
}

*Of course, this only works with String properties.

August 7, 2008

Selectively removing items from a Generic List

General, .NET/C#, Algorithms - By Shourov Bhattacharya

Suppose, I have a Generic List of objects, of a custom class that I have created, and I wish to remove some of the items from the list using a filter criteria that uses the properties of that object. For example, I have a list of OrderItem objects, and I want to remove all items that have a Price property of less than ten dollars. What is the best way to do it?

A first attempt might be to iterate over the list and remove items one by one:

foreach (OrderItem orderItem in orderItems)
{
if (orderItem.Price < 10.00M)
{
// need to delete this item
orderItems.Remove(orderItem);
}
}

Easy, right? Unfortunately, this code throws an error; because the List is modifed during an enumeration, and that is not allowed. To get around this, we can do a copy to a second list, enumerate over the first and remove from the second, then copy the second back over the first:

List<OrderItem> _orderItems = new List<OrderItem>();
foreach (OrderItem orderItem in orderItems)
{
_orderItems.Add(orderItem);
}
foreach (OrderItem orderItem in orderItems)
{
if (orderItem.Price < 10.00M)
{
// need to delete this item
_orderItems.Remove(orderItem);
}
}
orderItems = _orderItems;

That works, but it’s not very elegant. Is there a better way? Well, another option is to use a regular while loop rather than a foreach:

int i = _orderItems.Count-1;
while (i >= 0)
{
if (orderItem.Price < 10.00M)
{
// need to delete this item
_orderItems.Remove(orderItem);
}
i--;
}

That is better; in fact, it’s probably as good as you will get, if you think in a procedural way - iterating through the list and removing the items that don’t belong. However, as of C# 2.0, there is another solution, one that uses a functional paradigm. We can use the List.RemoveAll() method, which takes a function delegate as its argument. The RemoveAll method runs the provided function on each element, and removes those items which evaluate the function to true. In this case:

orderItems.RemoveAll( delegate (OrderItem orderItem)
{
return (orderItem.Price < 10.00M);
});

The function delegate in this line of code is called an anonymous function, for obvious reasons. This kind of code is familiar to those who are used to programming with functional code environments such as XSLT and jQuery. C# 2.0 and C# 3.5 are taking some steps in the direction of functional programming; it will be interesting to see how far they integrate functional programming constructs into the language.

June 20, 2008

Simple Gesture Recognition

General, Flash/Flex, Algorithms - By Shourov Bhattacharya

NOTE: You must have a webcam to use this application

This is a prototype application that uses video input from your webcam to do simple motion recognition. It rotates the cube to the left/right if you wave your hand towards the left/right.

1) When the security popup appears, choose to allow Flash to access your webcam.
2) Your webcam image should appear in the small panel on bottom right
3) Wave your hand to the left/right slowly in front of you, as if you were pushing aside a curtain
4) The cube should rotate to the left/right in response to your gesture

The red areas on the webcam image show areas of motion. At present, we are using a very naive motion detection algorithm that tries to exclude gross motions and looks for horizontal movement of the motion area. You might find that you need to slow down your gesture to get it to work, or that you need to move back from the camera.

At this point, this is nothing useful. It’s a beginning towards a proof of concept. The idea is that if this type of motion recognition can be made accurate and robust enough, it could open up new possibilities for interaction with your computer without having to add any new hardware.



June 16, 2008

Simple gesture recognition

General, Flash/Flex, Algorithms - By Shourov Bhattacharya

We have released our first, very basic prototype of gesture recognition using Flash and AS3. Basically, it uses a simple motion detection algorithm to look for horizontal motion and registers each discrete movement to the left as an event. The idea is that you can fire the event with a wave of your hand, which then triggers some kind of action within the application. At this stage, it’s not all that accurate or robust - you can’t move too fast, or too close to the camera, and even at an ideal distance it seems to miss a movement now and then. But it’s a start towards building a proof-of-concept for gesture recognition in Flash.

http://www.symmetri.com/mcontrol

March 27, 2008

Webcam fun with Flash

General, Flash/Flex, Algorithms - By Shourov Bhattacharya

We are well on the way to putting together a prototype of a motion-controlled environment in Flash using the webcam. The thinking is to capture video from the webcam, and then periodically call a server backend that uses AForge.NET for image processing and AI, and then present data back to the Flash client that can then be used for UI interaction or something similar. draw.logic blog has a decent little rundown of webcam capture and simple motion detection using Actionscript, and Mr doob has a nice example of an arty 3D transform of webcam content. There are a few other examples out there, but no one really seems to have taken the bull by the horns and tried to harness the image content in real time to do something more useful. There are challenges of course, but the possibilities are exciting. If we can build even a simple feedback system that is robust and has adequate accuracy, the applications will surely follow.

January 29, 2008

AForge.NET library for AI

General, .NET/C#, Algorithms - By Shourov Bhattacharya

I have been researching and sketching approaches to implementing ALife and AI applications for a while now, and I can’t believe I never had a good look at this - AForge, an open-source C# library developed for Computer Vision and AI.

I have just downloaded the latest version to browse just minutes ago, and already it looks mouth-watering - neural networks, genetic algorithms, machine learning, image processing … it has already covered more of the AI space than I would have expected (although much of it is sparsely covered, of course - this type of project is always a work in progress).

I am especially interested in the blob counting and filtering algorithms for possible applications with maps. Google maps and satellite images show enough detail to discern buildings and other structures with the naked eye - can an intelligent combination of edge detection, connected component (blob) finding and counting and other algorithms make sense of maps in interesting and useful ways? And can the power of 3D rendering and animation in Flash be used to present that insight in a new and compelling interface?

I hope that as we work in these areas we can provide some answers in the affirmative.

January 22, 2008

Adjust image brightness in C#

.NET/C#, Algorithms - By Shourov Bhattacharya

There are a couple of ways to do this. This method does not use ColorMatrix.


private static System.Drawing.Image _AlterBrightness(System.Drawing.Image bmp, int level)
{
   Graphics graphics = Graphics.FromImage(bmp);
    if (level == 50)
    {
       // do nothing
       return bmp;
    }
   else if (level <50)
    {
       // make it darker
       // Work out how much darker
       int conversion = 250 - (5 * level);
       Pen pDark = new Pen(Color.FromArgb(conversion,0,0,0), bmp.Width*2);
       graphics.DrawLine(pDark,0,0,bmp.Width,bmp.Height);
    }
    else if (level >50)
    {
       // mark it lighter
       // Work out how much lighter.
       int conversion = (5 * (level - 50));
       Pen pLight = new Pen(Color.FromArgb(conversion,255,255,255), bmp.Width*2);
       graphics.DrawLine(pLight,0,0,bmp.Width,bmp.Height);
    }
    graphics.Save();
    graphics.Dispose();
    return bmp;
}

Get free blog up and running in minutes with Blogsome
Theme designed by Janis Joseph