Archive for March, 2009|Monthly archive page

How to shuffle an array

This is a very common programming problem, e.g. when you wish to show some images in random order, when you want to show a random quote, etc. I will show you a solution in JavaScript, but it can be ported to other languages easily.

The common solution to shuffling is to swap random elements, but swapping means you have to constantly work with two elements, and it can be done by using only one. Also, it’s common to randomly swap e.g. 1000 elements, but that wouldn’t work well for very large arrays.

All you have to do is follow a couple of simple steps:

  1. First, of course you start with a simple array, which you have in a specified order.
  2. You place a random element in the output array, and remove the element
  3. Repeat step 2 until the array is empty

Here’s the full code

function shuffle(r) {
	var pos;
	var out = [];
	while (r.length > 0)	{
		pos = parseInt(Math.random()*r.length);
		out.push(r[pos]);
		r = r.slice(0,pos).concat(r.slice(pos+1, r.length));
	}
	return out;
}

Let’s examine the parts. First, the variables pos and out are defined, and out is initialized to an empty array.

	var pos;
	var out = [];

No we “loop” through the array. But, do note that this is not a real loop, in fact we’re constantly going to remove elements until the array is empty. So the simple check on length is enough here.

	while (r.length > 0)	{

We find a random element. For large arrays the Math.random() method can be considered unreliable, but then I mean really large.

		pos = parseInt(Math.random()*r.length);

Next, we add the random element to the output array, using push. Then we remove the element from the original array. This is done in three parts:

  1. get everything on the left of the element
  2. get everything on the right of the element
  3. concat these two arrays to form a new array

That is a really difficult way of removing just one element, but deleting an element from an array is not a native javascript method. John Resig wrote another version of delete

		out.push(r[pos]);
		r = r.slice(0,pos).concat(r.slice(pos+1, r.length));

And last but not least we return the new array:

	}
	return out;
}

Example

Here’s an example of calling the function, shuffling 52 integers

var r=[];
i=52;
while (i--)
	r.push(i);
alert(shuffle(r).join(','));

Extending Array

If you feel so inclined you can make it an extension of the Array object in javascript, like so:

Array.prototype.shuffle = function() {
	var r=this;
	... //rest of code
}

Note the additional "r=this" line.

Drawbacks

There are some drawbacks and warnings to take note of:

  • It can be slow for huge arrays, I would recommend it for arrays under 1000 items
  • It takes up additional space, since it creates a new array
  • The delete can be improved, by using e.g. John Resig’s version

A shorter ‘hack’

The following is a short hack that can most definitely suit one time needs. It’s a form of sorting randomly, which sounds weird, and it is, but it works. However, if the random value is not seeded again, the next time you run it you end up with the exact same sequence.

function shuffle(r) {
  r.sort(function() { return Math.random() } );
}

For more background see e.g. the Fisher Yates shuffle method (which by the way this is not, but this is).

Javascript floating point fix

Javascript has an annoying bug feature, and it’s the wrong result of simple floating point calculations. Try this:

alert(0.1+0.2);

Surprise! The result is 0.30000000000000004.

This can be quite annoying when programming timers, and doing math of course.

Why is this? It’s because floating points work that way, it’s by design. That’s not really a flaw, it’s just to make the process real fast. Just like your pocket calculator has only 8 digits, which isn’t wrong per se, but it’s never fully mathematically correct. Sun Microsystems has some real hardcore background on the floating point issue.

But for us simple programmers, all we want is a fix, right?

Here is one at lars-sh.de. Or download the Math Library directly. Kudos to Lars Knickrehm.

Convert Excel Tables To Lists

note: please see the update on the bottom of this article for an even quicker way to convert a table

In a dark past I was an Excel instructor (among other things). I have trained countless people in the art of Excel number wizardry. I have then become a certified Excel VBA specialist, and I must say, in my years being a professional programmer, this is the skill that has set me apart from all other programmers around me. Sure, people can do Regular Expressions… So can I. Sure, people can do Object Orientation. So can I. But what programmer fancies dumb jumbling of data, and programming a language that has the word ‘basic’ in it? Right. Most programmers I knew were Linux shell companies (pun intended) that had no idea something good was hidden up the sleeve of Microsoft. But ok, sometimes I was able to show them some awesome things, and they would instantly recognize that their world view (Excell is for end-users) was a grave mistake.

What fun it was to teach people Excel, and more so, VBA! To me, it’s the tool of all tools, and it can greatly help anyone who ever works with data (ehm, anyone). So in this first part of a long, long series (I hope) I will show you, the humble ignorant user how to convert an Excel table to a list.

Why? Many times I have gone to companies and helped them with some particular problem. Usually it started out with an analyst/marketer/ceo showing me a bunch of data. The data was always presented as a table, with column headings and row headings, with the data in the middle. That seems like a nice way to present data, yes, it is in fact. The first thing I would do is then convert this table to an ugly list.

So why would you convert it to a list?

Because Excel is in love with lists. Excel craves lists, it’s like Access’ little brother, but it can speak five languages and juggle 4 balls. It’s no database tool (maximum of 65535 rows, 1M in Excel 2007)… but it can transform any list into a deep, deep analysis.

The way this analysis is done later on is with Pivot Tables. I wrote those capitals on purpose. Pivot Tables are so powerful that you can basically give it any data list and it can tell you what’s missing, what’s wrong, what’s unique, what’s the total, what’s the average, you name it. But more on that later on.

Let’s convert!

Warning: code ahead…

A table consists of three parts:

  1. The row headings (left)
  2. The column headings (top)
  3. The data (center)

We will loop through the data cell by cell, and create a row in a new list for each. That’s the basic idea (pun intended).

Before we start, we check some preconditions. We have to make sure that we are inside a set of data, formed into a table. All we do is just check if we have at least two rows and two columns (not the ultimate, but it works).

Sub TableToList()
If ActiveCell.CurrentRegion.Rows.Count < 2 Then
Exit Sub
End If
If ActiveCell.CurrentRegion.Columns.Count < 2 Then
Exit Sub
End If

Then we will need some variables to refer to the various sections of the table

Dim table As Range
Dim rngColHead As Range
Dim rngRowHead As Range
Dim rngData As Range
Dim rngCurrCell As Range

Next. we will need some variables for the data itself

Dim rowVal As Variant
Dim colVal As Variant
Dim val As Variant

Now, we will start pointing our variables to the data, row headings and column headings, like so

Set table = ActiveCell.CurrentRegion
Set rngColHead = table.Rows(1)
Set rngRowHead = table.Columns(1)
Set rngData = table.Offset(1, 1)
Set rngData = rngData.Resize(rngData.Rows.Count - 1, rngData.Columns.Count - 1)

Note that “currentregion” is a handy tool that expands any cell into a surrounding of non-empty cells. So this way your selected cell could be anywhere inside the table when you run the macro. The data part is a bit harder, line 4 and 5 together shift and resize the original table to form the right bottom part, where all the data resides.

ActiveWorkbook.WorkSheets.Add

Next, we create a new sheet in the workbook, to hold the list.

ActiveCell.Value = "Row#"
ActiveCell.Offset(0, 1).Value = "RowValue"
ActiveCell.Offset(0, 2).Value = "ColValue"
ActiveCell.Offset(0, 3).Value = "Data"
ActiveCell.Offset(1, 0).Select

In this sheet, we create a first row, “manually”, where we name the column headings for our list. These column headings are very important for sorting, analysis, pivot tables, export and such. The last statement instantly moves the current cell selection one row down. Notice we’re inserting a special column for Row Number. This is not always necessary, but it doesn’t hurt, and it helps you to always be able to restore the original order of the list.

Now it’s time for the actual grunt work, looping through the table

Dim n As Long
For Each rngCurrCell In rngData
colVal = rngColHead.Cells(rngCurrCell.Column - table.Column + 1)
rowVal = rngRowHead.Cells(rngCurrCell.Row - table.Row + 1)

The “for each rngCurrCell in” is a real beauty in VBA. It just runs through any selection, without worries of overflows, row and column numbers, or calculations. In the loop, we set the value of the current column and row. Note that the rngCurrCell.column and rngCurrCell.row are not relative, it’s the actual number of the column/row. So if the tables starts at C3, the first cel is having column=3 and row=3.

n = n + 1
ActiveCell.Value = n

Here, we upped counter ‘n’ and put it in the list.

ActiveCell.Offset(0, 1).Value = rowVal
ActiveCell.Offset(0, 2).Value = colVal
ActiveCell.Offset(0, 3).Value = rngCurrCell.Value
ActiveCell.Offset(1, 0).Select

We do the same trick again to put a new row in the data list on our new sheet. As you can see this part of code is repeated from the part where we created the header. A small improvement would be to create a function named ‘newRow(n, rv, cv, dv)’ to insert a new row with these values.

If, instead of actual values, you prefer to link to the original cell, you can use

ActiveCell.Offset(0, 3).Value = "=" & rngCurrCell.Worksheet.Name & "!" & rngCurrCell.Address

Finish the loop with:

Next
End Sub

Well, that’s it!

Running your code

Make sure to have a table setup in Excel, and click inside the table, it will be automatically selected.

  • Press ALT+F8
  • Select TableToList
  • Click Run

In Office 2003 you can add a shape to your worksheet, right click, and choose assign macro.

  1. Choose Tools > Customize
  2. Choose Macros > Custom menu item -> drag to toolbar
  3. Right click item
  4. Choose Assign macro…
  5. Choose TableToList

Since Office 2007 this option is not available anymore, but you can still right click the ribbon and choose ‘customize quick access toolbar’. From there you can pick the Macro’s category and add the macro.

A new sheet will be created. Take a look at the list. You can try sorting, filtering, analyzing, totalling, and… pivot tables. A pivot table is a dynamically updating table which automatically totals values from a list, and presents them in… a table. Here’s how to re-create the original table from the list:”

  1. Choose Data > Pivot Table
  2. Choose Finish
  3. Drag ColumnValues to the ‘column fields’
  4. Drag RowValues to the ‘row fields’
  5. Drag Data into ‘Drop data items here’

Voila, the list is back. That is, if it was a list of numbers. Pivot Tables are for numeric operations, if you had text in there, it won’t show anything (anything good).

Download

Download the file here:

Table2List.xla

How to install:

  1. Office 2003: to install an XLA you need to go to Tools > AddIns and select the file with the browse button. Make sure the checkbox is enabled.
  2. Office 2007: Click on the Office Button in the left top, then Excel Options, then Add-Ins. Now select Manage… Excel Addins and click Go. Again browse and enable the addin.

You may wish to copy the file to the suggested Add-Ins folder. If you are on a network and wish to share the add-in with others, make sure to keep it on a network drive.

Once the Add-in is enabled you will see a new button that runs the macro. In Office 2007 the button is under the Add-Ins ribbon tab. You can also press ALT+F8, then type ‘Table2List.xla!TableToList’. The macro will be hidden in the XLA file, so you cannot select it.

An even quicker solution

Abu Yahya (see comments) gave me an even quicker solution. All kudos go to him for this.

First, start the pivot table wizard. Now in Excel 2007 and up you may have a hard time finding it! So, right click the quick access toolbar (it’s the bar with tiny icons on the top). Then choose Customize… and select Choose commands from: Commands not in the ribbon. Now find “pivottable and pivotchart wizard” in the list, and add it to the list on the right. You will see a tiny pivottable icon in the toolbar now.

Go on and click that icon, and then:

Step 1. Choose multiple consolidation ranges
Step 2. Choose I will create the page fields
Step 2b. Select Range of the table then Add to
Step 3. Choose New Worksheet
Step 4. Click Finish
Step 5. on the new sheet – Pivot table field list –> uncheck [ ] Row and [ ] Column
Step 6. There will be one value exactly in your pivottable. Double click it.

You will now see a new sheet with a list built up of the columns Row, Column and Value.

Step 7 (optional). Create a pivottable from this list to analyze your data.

One important note: you can have exactly 1 field that will show up in your data next to your row and column fields. That field has to be in the far left column of the data you select before consolidating. If you need more data in your final analysis you can combine fields with the “&” operator, using a formula like e.g.in cell C2:  =A2 & B2.