Archive for the ‘Excel’ Category
This is a super powerfeature I have always missed in Excel: the ability to join the content of cells without losing any data. Sorry, what’s that?? Yes, I know about the ‘merge’ option. It’s lame! Try merging two cells that both have content. Excel answers, delightfully happy, that you will only keep data in the first cell and lose all of the data of the other cells, and go deal with it.
So, I decided to ‘deal with it’ and I present here several macros for your pleasure and entertainment that will solve this once and for all.
Merging, Joining and Combining
Well, that’s all the same of course. Let’s start with a simple macro to join a bunch of cells and put the result in the first cell.
Sub Join() Dim out As String Dim c As Range out = "" For Each c In Selection.Cells out = out & c.Value c.ClearContents Next Selection.Cells(1, 1).Value = out End Sub
How does it work? First, we define the variable out which will contain all content. Then we loop over all cells in the current selection and we combine their values using the concatenation operator “&”. Finally we put the value of out in the first cell of the selection.
If you have never created a macro before, you can add them by pressing ALT+F11. After that, make sure to copy this one to the personal macro workbook.
In some occasions I needed to merge, but also keep the data separated by either commas or newlines. So I made a slightly modified version with arguments. Note for noobs: you can’t run these like normal macros, you’d have to go into the VB editor and start them from the immediate window. Or, you can make a short calling-macro similar to the JoinWithBreaks below.
' join the contents of a selection of cells ' and put all of these together in one single cell Sub Join(Optional s As String = "", Optional wrap As Boolean = True) Dim out As String Dim c As Range Dim n As Integer out = "" For Each c In Selection.Cells n = n + 1 out = out & c.Value If n < Selection.Cells.Count Then out = out & s End If c.ClearContents Next Selection.Cells(1, 1).Value = out Selection.Cells(1, 1).WrapText = wrap End Sub Sub JoinWithBreaks() Join vbNewLine, True End Sub
Another problem closely related to the merge problem is when you have a sheet looking like this:
And let’s say it extends a long long way to the bottom for your archive of ten years. To create a usable Excel table and e.g. to create a pivot table your table needs to have values in every cell, not just the first one. Humans can reason that cell A3 also belongs to January, but computers and thus Excel cannot. Clicking on the autofill handle in cell A2 fixes this, but doing so for 100 cells is still very tedious work. For that, we can use the following macro, which will automagically fill every empty cell with the value right above it.
Sub Filler() Dim c As Range For Each c In Selection.Cells If IsEmpty(c) And c.Row <> 1 Then c.FormulaR1C1 = c.Offset(-1, 0).FormulaR1C1 End If Next End Sub
What am I talking about? Client side “apps”, by which I mean browser based, server-less HTML pages, where everything is done on the client. Examples of such applications are CD Rom viewers, Touch Screen Console applications, and Information Display (like the train station screens).
Recently I got a request to make an Information Display. I started happily to look at how this would work using modern browser based techniques. After a couple of days of experimentation I must say that it was a wonderful journey, and I am convinced this technology will have a great future.
I have used the following technologies, and will explain hereafter how:
Excel 2007 and XML
Excel serves as my database. It consists of worksheets of tabular data. The data is exported to an XML file using the XMLTOOLS addin, which you can find on the microsoft site. It is very loosely designed: when I need an extra column, I insert it and start typing. I will have to re-generate an XML mapping then, and export the contents to an XML file. The advantage: the person working with this ‘database’ only needs to know Excel, and how to click on a few buttons.
jQuery and Cycle
I added jQuery most and for all for the Cycle plugin. It allowed me to create stunning visual transitions for ‘slides’. The slides are actually simple divs in an HTML page.
I used CSS3 for creating nice looking gradients. Also I used CSS3 for zebra tables and drop shadows on images. Overall it means there is no need whatsoever for images to enhance the visuals. I think that’s how future web development will and should occur.
SVG and/or Canvas
Ultimately I decided to use SVG for both my static vector based images (floor plan) and my animation (a clock). The SVG animation actually looked better than the canvas one and I wasn’t very interested in modifying the default look.
For the floor plan I hunted for a good SVG or Canvas editor. What I found was Google Docs (!). Recently they added a diagram editor, that can export to SVG. The resulting code unfortunately looks like hexadecimal soup, but the good thing is you can easily modify the diagram on Google Docs and export again.
Right now my app only works in FireFox 3.6, and that’s just fine, baby!!
Note: I cannot share the application since it’s made for our organization, but if you need help in setting up one yourself just add to the comments…
Any speedcuber knows how to calculate his or her average: total all, remove fastest and slowest, and divide. Here’s how you do that in Excel
=( SUM(A1:A12) – MIN(A1:A12) – MAX(A1:A12) ) / (COUNT(A1:A12) – 2)
But that is tedious of course, and your constantly changing the range to find the average of e.g. 5, or a running average. So instead I wrote a little VBA function you can put in a module.
- Press ALT+F11 to go to the editor
- Choose Insert > Module
- Choose Insert > Procedure
- Type CubeAVG
- Click Function
- Click OK
Now edit so it’s like the code below
' cubeavg : calculate speedcubing average Public Function cubeavg(r) Dim total As Double, fastest As Double, slowest As Double, n As Integer total = WorksheetFunction.Sum(r) fastest = WorksheetFunction.Min(r) slowest = WorksheetFunction.Max(r) n = WorksheetFunction.Count(r) - 2 cubeavg = (total - slowest - fastest) / n End Function
Don’t forget to save the file…
When creating SQL statements you’ll often need a date in the ISO 8601 standard format (e.g. 2010-03-26 12:34).
Of course you can change the format in Excel to show it as such, but that doesn’t give you the string you need, e.g. in an insert or update statement.
Here’s an Excel function to make an SQL date value, presuming the date value is in cell A1:
This circumvents the use of complicated IF and date/time functions. Append a “Z” if you need to indicate the timezone as UTC (i.e. GMT) time.
Here’s a short VBA function to create this type of date
Function SQLDate(d) SQLDate = WorksheetFunction.Text(d, "yyyy-mm-dd hh:MM:ss") End Function
Put this code in a new module in your workbook to instantly start using the function in Excel like this: “=SQLDate(A1)”
There are tons of ways to remove duplicates from a list of items, most of which are way too complicated and technical for a noob (if you don’t know what this is, then that’s you) to perform.
Let’s look at some of them, and let me know in the comments if you think these were useful. Here they come, in order of increasing difficulty (geekness)
TextPad is a free (well, nagware) text editor with so many built in tools I cannot talk about it without crying…
- open the file in TextPad
- select Tools > Sort
- check the box at ‘remove duplicate lines’
- click OK
Another program abused by millions (billions?) to do stuff that could be done with a 10 year old cellphone. What the bozos at Microsoft tell you is the dumbest way to do it, because you’re overwriting your original list. Here’s the smart way
Select the data
- Click Data > PivotTable… (Office 2003) or Insert > PivotTable (Office 2007)
- Click ‘Finish’ to create a new sheet with an empty pivot table
- Drag the column for which you need to remove duplicates into the left part of the pivottable
- adding flavor: you can now sort, filter, group, analyze, you name it.
If your data is in a database, and you have access to SQL, perform the following query:
- SELECT DISTINCT(column) FROM table
- in place of ‘column’ you type what you need to be unique
- in place of ‘table’ you type the table name
- tip: you can combine more columns by typing (column1, column2, …)
If you have the file on a linux or unix system, from the terminal (command line) type
- sort -u file > output
- where ‘file’ is the name of your file and ‘output’ is the name of the output file.
I’d like to know if you can come up with new and maybe even faster ways!
I have met a lot of people who know they ‘know’ Excel, even sometimes stating they are experts of some sort. Well, they were not, and I could usually spot it because they make one of the following ‘mistakes’. Of course they get the job done, but in the long run they usually got themselves into trouble with it. I used to be such a person myself, until I became a full time Excel and VBA trainer.
So here it goes, starting with the worst:
1. Using tables
Tables (like years on the left column months on the top row) are nice for a visual report, but the data cannot be used in any other way. If you’ve read my recent article about the TableToListConverter, you know why. Excel is an excellent analyzer of data, as long as this data is organized in rows (lists), much like a database.
2. Using cell references
Don’t you hate those formulas like
=C1*B6-H3+F5?. They’re horrible to work with. Wouldn’t it be easier to see =Rate*Hours-Discount+ExtraCost?
Well, that’s possible. Just click on the C1 in the top left corner and type the name ‘Rate’, then press Enter. From then on, C1 has an alias of ‘Rate’, and you can use it in formulas.
3. Using too many worksheets
Again too many people use too many worksheet. As above in #1, they create Worksheets named ‘jan’, ‘feb’, ‘mar’, ‘apr’ and so on. Later on it will become very difficult to get totals or filter this data. Forget about multiple worksheets and put everything in one sheet, as long as the data belongs together of course.
4. Using color to indicate meta data
Using a color to indicate something is great if you are the only person using the Excel file. That is hardly ever the case. You would have to explain: ‘well, red means this row is to be deleted, yellow means it’s not checked, and green means it is checked’. That’s nice, thank you, but the sheet looks butt ugly, and if I print this on black and white all that information is suddenly lost.
The correct way to indicate this information is to add another column or more columns next to your data. Give them descriptive heading values, like ‘deleted, checked’. Notice yellow and green can simply be combined by putting yes or no in the ‘checked’ column.
And did you know about the conditional formatting feature? Based on a value or even a formula you can color or format a cell. But, make sure to only do that for formatting that doesn’t relate to valuable information you should have visible in your sheet. It’s a good feature though for e.g. zebra-stripes (odd rows colored different from even rows).
5. Copy & Paste
Using copy and paste is fine, as long as you don’t copy when you don’t need to. A simple cell reference will point to value without needing a copy. So if you have a price in a C1 in one worksheet, and need it somewhere else, use
=C1, or of course
=Rate (see #2 in this list). Another way, which even works for many cells, is go to ‘Edit > Paste Special > Paste Link’.
6. Bad formatting
Excel is good at formatting data automatically, if you insert it in the ‘right’ way. E.g. try typing ’1-1′ and instead of ’0′ you will get ’1-jan’ depending on your date format. So the value is automatically converted to a date value. If you would type ‘January 1, 2009′, most likely Excel won’t recognize this and leave it as a text value. You cannot calculate with text values, but you can with Date values. So if A1 has ’1-jan’, and B1 has
=A1+1 then B1 will become ’2-jan’.
7. Too complex formulas
Of course, some formulas actually are quite complex, there’s no doubt about it. But there’s no need to make them look complex. Split your formulas in multiple parts, and use named ranges. So instead of
=Rate * Hours - Disount + ExtraCost you can use
=Rate * Hours in the Amount field and
=Amount - Discount + ExtraCost in the Total field.
8. Empty rows and columns
Empty rows are inserted to ‘make a sheet look nice’. However, by splitting e.g. January data and February data this way, Excel will assume you are dealing with two lists, not with one. The totals underneath using autosum will therefor work only on one of these sets. If you wish to have a bit more space, just drag the row height handle, or use Format > Row > Height to set a specific height.
9. Formatting for print
If you enlarge a font to make it look bigger on paper, you are making a mistake. Do it using File > Page Setup > Adjust to…
You can adjust to any size without changing the font. The same goes for displaying on the screen in fact. Use View > Zoom, or use the percentage dropdown box in the formatting toolbar.
10. Not using Pivot Tables.
I intentionally name it this way, since there are so many bad ways of analyzing data, getting totals, count, average and more on a set of data. But there’s only one way to do it extremely efficient, powerful and it brings lots more uses with it. Also, it keeps your data separate from your analysis. It’s called a pivot table. Make sure to have a list of data rows with headings, and then choose Data > Pivot Table. Now just click Finish right away instead of following the wizard. You can now drag in column heading names and be presented with a table with totals.
There’s so much you can do with a pivot table I am going to leave it to Microsoft to explain.
A very practical use of Pivot Tables is e.g. to get only unique values for a certain row. Since these are automatically grouped in a pivot table, you can simply copy and paste the list of unique values from there.
note: please see the update on the bottom of this article for an even quicker way to convert a table
In a dark past I was an Excel instructor (among other things). I have trained countless people in the art of Excel number wizardry. I have then become a certified Excel VBA specialist, and I must say, in my years being a professional programmer, this is the skill that has set me apart from all other programmers around me. Sure, people can do Regular Expressions… So can I. Sure, people can do Object Orientation. So can I. But what programmer fancies dumb jumbling of data, and programming a language that has the word ‘basic’ in it? Right. Most programmers I knew were Linux shell companies (pun intended) that had no idea something good was hidden up the sleeve of Microsoft. But ok, sometimes I was able to show them some awesome things, and they would instantly recognize that their world view (Excell is for end-users) was a grave mistake.
What fun it was to teach people Excel, and more so, VBA! To me, it’s the tool of all tools, and it can greatly help anyone who ever works with data (ehm, anyone). So in this first part of a long, long series (I hope) I will show you, the humble ignorant user how to convert an Excel table to a list.
Why? Many times I have gone to companies and helped them with some particular problem. Usually it started out with an analyst/marketer/ceo showing me a bunch of data. The data was always presented as a table, with column headings and row headings, with the data in the middle. That seems like a nice way to present data, yes, it is in fact. The first thing I would do is then convert this table to an ugly list.
So why would you convert it to a list?
Because Excel is in love with lists. Excel craves lists, it’s like Access’ little brother, but it can speak five languages and juggle 4 balls. It’s no database tool (maximum of 65535 rows, 1M in Excel 2007)… but it can transform any list into a deep, deep analysis.
The way this analysis is done later on is with Pivot Tables. I wrote those capitals on purpose. Pivot Tables are so powerful that you can basically give it any data list and it can tell you what’s missing, what’s wrong, what’s unique, what’s the total, what’s the average, you name it. But more on that later on.
Warning: code ahead…
A table consists of three parts:
- The row headings (left)
- The column headings (top)
- The data (center)
We will loop through the data cell by cell, and create a row in a new list for each. That’s the basic idea (pun intended).
Before we start, we check some preconditions. We have to make sure that we are inside a set of data, formed into a table. All we do is just check if we have at least two rows and two columns (not the ultimate, but it works).
Sub TableToList() If ActiveCell.CurrentRegion.Rows.Count < 2 Then Exit Sub End If If ActiveCell.CurrentRegion.Columns.Count < 2 Then Exit Sub End If
Then we will need some variables to refer to the various sections of the table
Dim table As Range Dim rngColHead As Range Dim rngRowHead As Range Dim rngData As Range Dim rngCurrCell As Range
Next. we will need some variables for the data itself
Dim rowVal As Variant Dim colVal As Variant Dim val As Variant
Now, we will start pointing our variables to the data, row headings and column headings, like so
Set table = ActiveCell.CurrentRegion Set rngColHead = table.Rows(1) Set rngRowHead = table.Columns(1) Set rngData = table.Offset(1, 1) Set rngData = rngData.Resize(rngData.Rows.Count - 1, rngData.Columns.Count - 1)
Note that “currentregion” is a handy tool that expands any cell into a surrounding of non-empty cells. So this way your selected cell could be anywhere inside the table when you run the macro. The data part is a bit harder, line 4 and 5 together shift and resize the original table to form the right bottom part, where all the data resides.
Next, we create a new sheet in the workbook, to hold the list.
ActiveCell.Value = "Row#" ActiveCell.Offset(0, 1).Value = "RowValue" ActiveCell.Offset(0, 2).Value = "ColValue" ActiveCell.Offset(0, 3).Value = "Data" ActiveCell.Offset(1, 0).Select
In this sheet, we create a first row, “manually”, where we name the column headings for our list. These column headings are very important for sorting, analysis, pivot tables, export and such. The last statement instantly moves the current cell selection one row down. Notice we’re inserting a special column for Row Number. This is not always necessary, but it doesn’t hurt, and it helps you to always be able to restore the original order of the list.
Now it’s time for the actual grunt work, looping through the table
Dim n As Long For Each rngCurrCell In rngData colVal = rngColHead.Cells(rngCurrCell.Column - table.Column + 1) rowVal = rngRowHead.Cells(rngCurrCell.Row - table.Row + 1)
The “for each rngCurrCell in” is a real beauty in VBA. It just runs through any selection, without worries of overflows, row and column numbers, or calculations. In the loop, we set the value of the current column and row. Note that the rngCurrCell.column and rngCurrCell.row are not relative, it’s the actual number of the column/row. So if the tables starts at C3, the first cel is having column=3 and row=3.
n = n + 1 ActiveCell.Value = n
Here, we upped counter ‘n’ and put it in the list.
ActiveCell.Offset(0, 1).Value = rowVal ActiveCell.Offset(0, 2).Value = colVal ActiveCell.Offset(0, 3).Value = rngCurrCell.Value ActiveCell.Offset(1, 0).Select
We do the same trick again to put a new row in the data list on our new sheet. As you can see this part of code is repeated from the part where we created the header. A small improvement would be to create a function named ‘newRow(n, rv, cv, dv)’ to insert a new row with these values.
If, instead of actual values, you prefer to link to the original cell, you can use
ActiveCell.Offset(0, 3).Value = "=" & rngCurrCell.Worksheet.Name & "!" & rngCurrCell.Address
Finish the loop with:
Next End Sub
Well, that’s it!
Running your code
Make sure to have a table setup in Excel, and click inside the table, it will be automatically selected.
- Press ALT+F8
- Select TableToList
- Click Run
In Office 2003 you can add a shape to your worksheet, right click, and choose assign macro.
- Choose Tools > Customize
- Choose Macros > Custom menu item -> drag to toolbar
- Right click item
- Choose Assign macro…
- Choose TableToList
Since Office 2007 this option is not available anymore, but you can still right click the ribbon and choose ‘customize quick access toolbar’. From there you can pick the Macro’s category and add the macro.
A new sheet will be created. Take a look at the list. You can try sorting, filtering, analyzing, totalling, and… pivot tables. A pivot table is a dynamically updating table which automatically totals values from a list, and presents them in… a table. Here’s how to re-create the original table from the list:”
- Choose Data > Pivot Table
- Choose Finish
- Drag ColumnValues to the ‘column fields’
- Drag RowValues to the ‘row fields’
- Drag Data into ‘Drop data items here’
Voila, the list is back. That is, if it was a list of numbers. Pivot Tables are for numeric operations, if you had text in there, it won’t show anything (anything good).
Download the file here:
How to install:
- Office 2003: to install an XLA you need to go to Tools > AddIns and select the file with the browse button. Make sure the checkbox is enabled.
- Office 2007: Click on the Office Button in the left top, then Excel Options, then Add-Ins. Now select Manage… Excel Addins and click Go. Again browse and enable the addin.
You may wish to copy the file to the suggested Add-Ins folder. If you are on a network and wish to share the add-in with others, make sure to keep it on a network drive.
Once the Add-in is enabled you will see a new button that runs the macro. In Office 2007 the button is under the Add-Ins ribbon tab. You can also press ALT+F8, then type ‘Table2List.xla!TableToList’. The macro will be hidden in the XLA file, so you cannot select it.
An even quicker solution
Abu Yahya (see comments) gave me an even quicker solution. All kudos go to him for this.
First, start the pivot table wizard. Now in Excel 2007 and up you may have a hard time finding it! So, right click the quick access toolbar (it’s the bar with tiny icons on the top). Then choose Customize… and select Choose commands from: Commands not in the ribbon. Now find “pivottable and pivotchart wizard” in the list, and add it to the list on the right. You will see a tiny pivottable icon in the toolbar now.
Go on and click that icon, and then:
Step 1. Choose multiple consolidation ranges
Step 2. Choose I will create the page fields
Step 2b. Select Range of the table then Add to
Step 3. Choose New Worksheet
Step 4. Click Finish
Step 5. on the new sheet – Pivot table field list –> uncheck [ ] Row and [ ] Column
Step 6. There will be one value exactly in your pivottable. Double click it.
You will now see a new sheet with a list built up of the columns Row, Column and Value.
Step 7 (optional). Create a pivottable from this list to analyze your data.
One important note: you can have exactly 1 field that will show up in your data next to your row and column fields. That field has to be in the far left column of the data you select before consolidating. If you need more data in your final analysis you can combine fields with the “&” operator, using a formula like e.g.in cell C2: =A2 & B2.