About the author

Miron Abramson
Software Engineer,
CTO at PixeliT
and a .NET addict for a long time.
Open source projects:
MbCompression - Compression library

Uniform Distribution algorithm in C#

Background
Where I work, one of the services we provide to our clients is medical articles. Over time, we have accumulated tons of articles. Millions. All of them are PDF files, and all of them are in the same folder.

The Problem
It became impossible to open the folder with Windows Explorer, let alone search, copy or move files. I had to find a way to reorganize it by dividing it into 1000 folders, each holding roughly the same number of files, so that when a request for a specific file comes in, I can tell which folder it is in.

The Solution 
I contacted my brother Ari (his site is outdated) for help. He has a PhD from the Electrical Engineering department at the Technion institute. He is the smartest guy I know. So this is his algorithm; I just implemented it in C#. It's not too complicated, but it does the job perfectly.

The algorithm gets a string and maps it into the set 0-999 (or any given range) with uniform distribution.

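/// <summary>
/// Get the position in the range of the specified string
/// </summary>
/// <param name="input">The string to map (for example, a file name)</param>
/// <returns>The position of the string within the range; 0 for a null or empty string</returns>
public int GetPosition(string input)
{
    if (string.IsNullOrEmpty(input))
    {
        return 0;
    }

    // Hash the input first, then strip the '-' separators from the hash string
    input = hasher.ComputeHash(input).Replace("-", string.Empty);
    double Sum = 0.0;
    int Aj;      // character code of the j-th character of the hash string
    double Tj;   // exponent: a multiple of PI/2 that cycles with j % 5
    double Pj;   // Aj raised to the power Tj

    for (int j = 0; j < input.Length; j++)
    {
        Aj = (int)input[j];
        Tj = (Math.PI * (1 + j % 5) / 2);
        Pj = Math.Pow(Aj, Tj);
        // Scale the gap between Pj and the next integer by Range, then weight it by Tj
        Sum += Math.Round((Math.Ceiling(Pj) - Pj) * Range) * Tj;
    }
    // Fold the sum into the range and shift it by the lower bound
    return ((int)(Sum % Range)) + LowerValue;
}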
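
The method above uses members that are not shown in this post (hasher, Range and LowerValue); the real class is in the source download. For readers who want to try the idea without it, here is a minimal self-contained sketch. The class name MapperSketch, the constructor, the lowerValue default of 0 and the choice of MD5 formatted with BitConverter.ToString are all assumptions made for illustration, not necessarily what the downloadable code does.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Illustrative sketch only: class name, constructor and hash choice are assumptions.
public class MapperSketch
{
    private readonly int Range;       // number of positions, e.g. 1000 folders
    private readonly int LowerValue;  // first position of the range (assumed 0 by default)

    public MapperSketch(int range, int lowerValue = 0)
    {
        if (range <= 0) throw new ArgumentOutOfRangeException(nameof(range));
        Range = range;
        LowerValue = lowerValue;
    }

    // Assumed stand-in for hasher.ComputeHash: MD5 of the UTF-8 bytes as a hex string.
    private static string HashToHex(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            // BitConverter.ToString yields "AB-CD-..."; drop the dashes.
            return BitConverter.ToString(hash).Replace("-", string.Empty);
        }
    }

    public int GetPosition(string input)
    {
        // The post returns 0 here; with LowerValue = 0 the two agree.
        if (string.IsNullOrEmpty(input)) return LowerValue;

        string hex = HashToHex(input);
        double sum = 0.0;
        for (int j = 0; j < hex.Length; j++)
        {
            int aj = hex[j];                           // character code
            double tj = Math.PI * (1 + j % 5) / 2;     // cycling exponent
            double pj = Math.Pow(aj, tj);
            sum += Math.Round((Math.Ceiling(pj) - pj) * Range) * tj;
        }
        return (int)(sum % Range) + LowerValue;
    }
}
```

Every term added to the sum is non-negative, so the remainder always falls between 0 and Range - 1 before the lower bound is added, and the same string always lands on the same position.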

To use it, just create an object of type Mapper and call the 'GetPosition' method for any string you need to map:

Mapper aMap = new Mapper(RANGE);
int position = aMap.GetPosition(input);
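
Once a file name has a position, that position still has to be turned into a folder on disk. The post does not show that step, so the following is only a sketch of one way to do it. The FolderLayout class, the zero-padded three-digit folder names and the root path in the usage note are hypothetical choices, not part of the original code.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper: turns a position (0-999) into a folder and a full file path.
public static class FolderLayout
{
    // 42 -> "042", so the folders sort naturally in a directory listing.
    public static string FolderName(int position)
    {
        return position.ToString("D3", CultureInfo.InvariantCulture);
    }

    // Builds <root>/<folder>/<fileName> with the platform's path separator.
    public static string PathFor(string root, int position, string fileName)
    {
        return Path.Combine(root, FolderName(position), fileName);
    }
}
```

With the mapper from above this could be called as FolderLayout.PathFor(@"D:\Articles", aMap.GetPosition(fileName), fileName), both when moving a file into its folder and later when serving a request for it, since the same name always maps to the same position.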

The Results 

I ran the algorithm a few times with random strings 10 characters long and mapped them into several ranges of 'folders'. Here are the results:

1,000,000 random strings into a range of 1000: the folder with the maximum # of files contained 1,156 files, and the minimum contained 886 files.
1,000,000 random strings into a range of 100: the folder with the maximum # of files contained 10,699 files, and the minimum contained 8,800 files.
1,000,000 random strings into a range of 10: the folder with the maximum # of files contained 106,873 files, and the minimum contained 98,747 files.
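
The test program behind these numbers is not shown in the post. A measurement of this kind could be sketched roughly as follows; the DistributionCheck class, the alphabet used for the random strings and the seed parameter are assumptions for illustration. The mapping function is passed in as a delegate and is expected to return values from 0 to range - 1.

```csharp
using System;
using System.Linq;

// Hypothetical harness: counts how many random strings land in each position.
public static class DistributionCheck
{
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Builds a random string of the given length from the alphabet above.
    public static string RandomString(Random rng, int length)
    {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            chars[i] = Alphabet[rng.Next(Alphabet.Length)];
        }
        return new string(chars);
    }

    // Maps `samples` random 10-character strings and returns the count per position.
    public static int[] Count(Func<string, int> map, int range, int samples, int seed)
    {
        Random rng = new Random(seed);
        int[] counts = new int[range];
        for (int i = 0; i < samples; i++)
        {
            counts[map(RandomString(rng, 10))]++;
        }
        return counts;
    }
}
```

Passing aMap.GetPosition as the map argument with a range of 1000 and 1,000,000 samples, then reading counts.Max() and counts.Min(), gives the kind of figures listed above. The exact counts depend on the random strings generated, so they will differ from run to run unless the seed is fixed.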

Very nice results. Maybe not perfect, but certainly good enough for this kind of purpose!

Special thanks to my brother Ari for his help.

Source code:

AriMap.zip (2.34 kb)

Categories: C#
Posted by Miron on Wednesday, April 02, 2008 8:58 AM