Archive for March, 2009|Monthly archive page
How to shuffle an array
This is a very common programming problem, e.g. when you wish to show some images in random order, when you want to show a random quote, etc. I will show you a solution in JavaScript, but it can be ported to other languages easily.
The common solution to shuffling is to swap random elements, but swapping means you have to constantly work with two elements, and it can be done by using only one. Also, it’s common to randomly swap e.g. 1000 elements, but that wouldn’t work well for very large arrays.
All you have to do is follow a couple of simple steps:
- First, of course you start with a simple array, which you have in a specified order.
- You place a random element in the output array, and remove the element
- Repeat step 2 until the array is empty
Here’s the full code
function shuffle(r) {
var pos;
var out = [];
while (r.length > 0) {
pos = parseInt(Math.random()*r.length);
out.push(r[pos]);
r = r.slice(0,pos).concat(r.slice(pos+1, r.length));
}
return out;
}
Let’s examine the parts. First, the variables pos and out are defined, and out is initialized to an empty array.
var pos; var out = [];
No we “loop” through the array. But, do note that this is not a real loop, in fact we’re constantly going to remove elements until the array is empty. So the simple check on length is enough here.
while (r.length > 0) {
We find a random element. For large arrays the Math.random() method can be considered unreliable, but then I mean really large.
pos = parseInt(Math.random()*r.length);
Next, we add the random element to the output array, using push. Then we remove the element from the original array. This is done in three parts:
- get everything on the left of the element
- get everything on the right of the element
- concat these two arrays to form a new array
That is a really difficult way of removing just one element, but deleting an element from an array is not a native javascript method. John Resig wrote another version of delete…
out.push(r[pos]); r = r.slice(0,pos).concat(r.slice(pos+1, r.length));
And last but not least we return the new array:
} return out; }
Example
Here’s an example of calling the function, shuffling 52 integers
var r=[];
i=52;
while (i--)
r.push(i);
alert(shuffle(r).join(','));
Extending Array
If you feel so inclined you can make it an extension of the Array object in javascript, like so:
Array.prototype.shuffle = function() {
var r=this;
... //rest of code
}
Note the additional "r=this" line.
Drawbacks
There are some drawbacks and warnings to take note of:
- It can be slow for huge arrays, I would recommend it for arrays under 1000 items
- It takes up additional space, since it creates a new array
- The delete can be improved, by using e.g. John Resig’s version
A shorter ‘hack’
The following is a short hack that can most definitely suit one time needs. It’s a form of sorting randomly, which sounds weird, and it is, but it works. However, if the random value is not seeded again, the next time you run it you end up with the exact same sequence.
function shuffle(r) {
r.sort(function() { return Math.random() } );
}
For more background see e.g. the Fisher Yates shuffle method (which by the way this is not, but this is).
Javascript floating point fix
Javascript has an annoying bug feature, and it’s the wrong result of simple floating point calculations. Try this:
alert(0.1+0.2);
Surprise! The result is 0.30000000000000004.
This can be quite annoying when programming timers, and doing math of course.
Why is this? It’s because floating points work that way, it’s by design. That’s not really a flaw, it’s just to make the process real fast. Just like your pocket calculator has only 8 digits, which isn’t wrong per se, but it’s never fully mathematically correct. Sun Microsystems has some real hardcore background on the floating point issue.
But for us simple programmers, all we want is a fix, right?
Here is one at lars-sh.de. Or download the Math Library directly. Kudos to Lars Knickrehm.
Convert Excel Tables To Lists
In a dark past I was an Excel instructor (among other things). I have trained countless people in the art of Excel number wizardry. I have then become a certified Excel VBA specialist, and I must say, in my years being a professional programmer, this is the skill that has set me apart from all other programmers around me. Sure, people can do Regular Expressions… So can I. Sure, people can do Object Orientation. So can I. But what programmer fancies dumb jumbling of data, and programming a language that has the word ‘basic’ in it? Right. Most programmers I knew were Linux shell companies (pun intended) that had no idea something good was hidden up the sleeve of Microsoft. But ok, sometimes I was able to show them some awesome things, and they would instantly recognize that their world view (Excell is for end-users) was a grave mistake.
What fun it was to teach people Excel, and more so, VBA! To me, it’s the tool of all tools, and it can greatly help anyone who ever works with data (ehm, anyone). So in this first part of a long, long series (I hope) I will show you, the humble ignorant user how to convert an Excel table to a list.
Why? Many times I have gone to companies and helped them with some particular problem. Usually it started out with an analyst/marketer/ceo showing me a bunch of data. The data was always presented as a table, with column headings and row headings, with the data in the middle. That seems like a nice way to present data, yes, it is in fact. The first thing I would do is then convert this table to an ugly list.
So why would you convert it to a list?
Because Excel is in love with lists. Excel craves lists, it’s like Access’ little brother, but it can speak five languages and juggle 4 balls. It’s no database tool (maximum of 65535 rows, 1M in Excel 2007)… but it can transform any list into a deep, deep analysis.
The way this analysis is done later on is with Pivot Tables. I wrote those capitals on purpose. Pivot Tables are so powerful that you can basically give it any data list and it can tell you what’s missing, what’s wrong, what’s unique, what’s the total, what’s the average, you name it. But more on that later on.
Let’s convert!
Warning: code ahead…
A table consists of three parts:
- The row headings (left)
- The column headings (top)
- The data (center)
We will loop through the data cell by cell, and create a row in a new list for each. That’s the basic idea (pun intended).
Before we start, we check some preconditions. We have to make sure that we are inside a set of data, formed into a table. All we do is just check if we have at least two rows and two columns (not the ultimate, but it works).
Sub TableToList() If ActiveCell.CurrentRegion.Rows.Count < 2 Then Exit Sub End If If ActiveCell.CurrentRegion.Columns.Count < 2 Then Exit Sub End If
Then we will need some variables to refer to the various sections of the table
Dim table As Range Dim rngColHead As Range Dim rngRowHead As Range Dim rngData As Range Dim rngCurrCell As Range
Next. we will need some variables for the data itself
Dim rowVal As Variant Dim colVal As Variant Dim val As Variant
Now, we will start pointing our variables to the data, row headings and column headings, like so
Set table = ActiveCell.CurrentRegion Set rngColHead = table.Rows(1) Set rngRowHead = table.Columns(1) Set rngData = table.Offset(1, 1) Set rngData = rngData.Resize(rngData.Rows.Count - 1, rngData.Columns.Count - 1)
Note that “currentregion” is a handy tool that expands any cell into a surrounding of non-empty cells. So this way your selected cell could be anywhere inside the table when you run the macro. The data part is a bit harder, line 4 and 5 together shift and resize the original table to form the right bottom part, where all the data resides.
ActiveWorkbook.WorkSheets.Add
Next, we create a new sheet in the workbook, to hold the list.
ActiveCell.Value = "Row#" ActiveCell.Offset(0, 1).Value = "RowValue" ActiveCell.Offset(0, 2).Value = "ColValue" ActiveCell.Offset(0, 3).Value = "Data" ActiveCell.Offset(1, 0).Select
In this sheet, we create a first row, “manually”, where we name the column headings for our list. These column headings are very important for sorting, analysis, pivot tables, export and such. The last statement instantly moves the current cell selection one row down. Notice we’re inserting a special column for Row Number. This is not always necessary, but it doesn’t hurt, and it helps you to always be able to restore the original order of the list.
Now it’s time for the actual grunt work, looping through the table
Dim n As Long For Each rngCurrCell In rngData colVal = rngColHead.Cells(rngCurrCell.Column - table.Column + 1) rowVal = rngRowHead.Cells(rngCurrCell.Row - table.Row + 1)
The “for each rngCurrCell in” is a real beauty in VBA. It just runs through any selection, without worries of overflows, row and column numbers, or calculations. In the loop, we set the value of the current column and row. Note that the rngCurrCell.column and rngCurrCell.row are not relative, it’s the actual number of the column/row. So if the tables starts at C3, the first cel is having column=3 and row=3.
n = n + 1 ActiveCell.Value = n
Here, we upped counter ‘n’ and put it in the list.
ActiveCell.Offset(0, 1).Value = rowVal ActiveCell.Offset(0, 2).Value = colVal ActiveCell.Offset(0, 3).Value = rngCurrCell.Value ActiveCell.Offset(1, 0).Select
We do the same trick again to put a new row in the data list on our new sheet. As you can see this part of code is repeated from the part where we created the header. A small improvement would be to create a function named ‘newRow(n, rv, cv, dv)’ to insert a new row with these values.
If, instead of actual values, you prefer to link to the original cell, you can use
ActiveCell.Offset(0, 3).Value = "=" & rngCurrCell.Worksheet.Name & "!" & rngCurrCell.Address
Finish the loop with:
Next End Sub
Well, that’s it!
Running your code
Make sure to have a table setup in Excel, and click inside the table, it will be automatically selected.
- Press ALT+F8
- Select TableToList
- Click Run
In Office 2003 you can add a shape to your worksheet, right click, and choose assign macro.
- Choose Tools > Customize
- Choose Macros > Custom menu item -> drag to toolbar
- Right click item
- Choose Assign macro…
- Choose TableToList
Since Office 2007 this option is not available anymore, but you can still right click the ribbon and choose ‘customize quick access toolbar’. From there you can pick the Macro’s category and add the macro.
A new sheet will be created. Take a look at the list. You can try sorting, filtering, analyzing, totalling, and… pivot tables. A pivot table is a dynamically updating table which automatically totals values from a list, and presents them in… a table. Here’s how to re-create the original table from the list:”
- Choose Data > Pivot Table
- Choose Finish
- Drag ColumnValues to the ‘column fields’
- Drag RowValues to the ‘row fields’
- Drag Data into ‘Drop data items here’
Voila, the list is back. That is, if it was a list of numbers. Pivot Tables are for numeric operations, if you had text in there, it won’t show anything (anything good).
Download
Download the file here:
How to install:
- Office 2003: to install an XLA you need to go to Tools > AddIns and select the file with the browse button. Make sure the checkbox is enabled.
- Office 2007: Click on the Office Button in the left top, then Excel Options, then Add-Ins. Now select Manage… Excel Addins and click Go. Again browse and enable the addin.
You may wish to copy the file to the suggested Add-Ins folder. If you are on a network and wish to share the add-in with others, make sure to keep it on a network drive.
Once the Add-in is enabled you will see a new button that runs the macro. In Office 2007 the button is under the Add-Ins ribbon tab. You can also press ALT+F8, then type ‘Table2List.xla!TableToList’. The macro will be hidden in the XLA file, so you cannot select it.
Comments (1)
