Archive for the ‘software’ Category

Permutations with repetition using Excel

The back story

Recently I have been trying to limit my use of VBA. It’s not that I was addicted and needed to go to VBA-rehab, on the contrary, I still love VBA till death do us part. But the thing is, every time someone opens an Excel sheet with VBA macros they are reminded of Excel’s vulnerability and the risks of macros.

So I set out to make workbooks that do the same thing, but without VBA. Not always is this possible, or efficient to do so. But when it’s possible, it also comes with great performance and great stability. No code needs to be changed, ever. Of course there are also downsides. It’s not as flexible as VBA, so you’re stuck in a rigid framework that solves one thing and one thing only. But it does it so well, oh my.

My latest endeavor was with permutations. I needed something that would generate all permutations of the tokens F,C,R,A (don’t ask) with repetition. As some know I am an avid speed cuber, that is solving the Rubik’s cube for speed. And the Rubik’s cube is a permutation puzzle. So I have dealt with permutations quite a lot. For those who haven’t paid attention in math class: you should know that permutations come in two flavors: with repetition (Pr) and without repetition (P). The number of permutations (Pr) in these four tokens F,C,R,A is 4^4, or 256. That number is exponential, so it grows so fast that at 5 tokens you are at 3,125 permutations and at 6 tokens at 46,656 permutations. At a set size of 10 tokens you are at 10^10 or 10,000,000,000 (ten billion) permutations. VBA would surely choke on that number of statements to follow. Excel can handle 1 million rows, although I wouldn’t put it to the test with that.

A clean slate

I started out writing a VBA program, that generates all permutations (Pr). The funny thing is if you go online and expect to find a bunch of worked out examples of algorithms, you don’t. Almost all examples you find are about permutations without repetition, which is like working with real objects, since you can’t duplicate real objects. The lottery is a good example of this. The program worked, but it was slow, and cumbersome. So let’s drop VBA and try it without.

CaptureFirst, I set out with some settings (sheet ‘settings’). We define the pattern, and calculate the size of the set (using LEN), and the number of possible permutations (using Length^Length). These values will be used extensively in the functions to generate permutations.

The maximum length of the pattern is set to 5, which totals 3,125 permutations, an amount which Excel can handle in the blink of an eye. You could extend the grid to a width of 7, which would come to 823,543 permutations. I’d be interested to know how fast Excel would generate the output, and how big the file would become. If you try it out please let me know.

Formula frenzy

CaptureGo to the output sheet and look at the grid on the right. There you see in the top row a couple of simple formulas. They are to set the repeat cycle of that column. In the first column you see a token repeated once, in the second column 4 times, in the third column 16 times. We’re multiplying by Length to set the repeat cycle. This is an easy way to generate permutations of any set. Think of how it works with regular counting. You start with 0,1,2,3,4,5,6,7,8,9, and then go one digit to the left, and you repeat, but now in cycles of 10, so that 10,11,12,13,14,15,16,17,18,19 has exactly 10 times a 1. We’re using the same principle.

Our basic generating formula is as follows:


=MID(Pattern,MOD($D4/E$3,Length)+1,1)

If you don’t know MOD, this is a function for modulo, also called the remainder after performing division. The MID function gets a character from a specified position in the string. The column with N is simply to count and use the MOD function properly.

By using absolute referencing we are now able to copy the cell E4 to all other cells in the grid, while keeping a properly working formula. With a pattern of length 4 we can ignore the last column, which is only needed for a pattern of length 5.

In the settings sheet you will see a width and a height. By selecting the range starting at E4 with that width and height, we get exactly all permutations in the set.

Alternatively, on the left, there is a table with all tokens concatenated in one string, for ease of use. The formula for this is:


=IF(D4>=NumItems,"",LEFT(CONCATENATE(E4,F4,G4,H4,I4),Length))

Using FCRA as a pattern we can now see all 256 permutations in column A!

Perfect permutations

CaptureWell, permutations without repetition are actually a subset of permutations with repetition (P < Pr). In a permutation without repetition you don’t have any duplicates. So for the tokens F,C,R,A a valid Pr would be FFFF, but it’s not a member of P. You can only get a member of P by swapping original tokens. So e.g. FRCA is a member of P. That’s why the number of items in P doesn’t grow as fast as in Pr. Four tokens gets to 4*3*2*1=24 permutations. This is called a factorial.

If a token set does not contain duplicates we can easily filter out the permutations we need. E.g. in the set A,A,B,B we still get duplicates in the list, and so we can’t filter. But in F,C,R,A it’s quite possible using Excel. The formula used is a but difficult though, and requires some thought:


=SUM(IF(FREQUENCY(
  MATCH(OFFSET(E4,0,0,1,Length),OFFSET(E4,0,0,1,Length),0),
  MATCH(OFFSET(E4,0,0,1,Length),OFFSET(E4,0,0,1,Length),0))
  >0,1))=Length

Here, OFFSET gives us a dynamically defined range, which is handy, because we don’t know how long the pattern is beforehand. In this formula, Length is the size of the set (aka the length of the string). Both OFFSET and MATCH return multiple values, so it’s impossible to split the formula into more cells, but just for clarity, let’s view it in condensed form:


=SUM(IF(FREQUENCY({FFFA}, {FA})>0,1))=Length

What it does is it totals the frequencies of each character in the set, so in this case it returns 2, and then checks to see if it matches the length (4). If it matches we have a permutation. Note this only works for patterns that have no repeating tokens, like FCRA.

Using named formulas we can simplify the long formula to:


=SUM(IF(FREQUENCY(
  MATCH(tokens,tokens,0),
  MATCH(tokens,tokens,0))>0,1))=Length

…where tokens equals OFFSET(E4,0,0,1,Length)

Now we have a formula to detect permutations. Unfortunately we still have duplicates, because our table always has 5 tokens and we might have a shorter pattern, like our example FCRA. So we use an IF to detect empty cells in column A and we can now use Excel’s filter (On the ribbon choose Data, then Filter) to get all permutations.

Download

permutations with repetition (320KB)

Using named ranges and worksheet functions in Excel VBA

Bold Brackets

A couple of days ago I saw something in an article on StackOverflow, that blew my mind. I can’t find the article anymore, but I do remember what this one neat trick was, that will for ever change your VBA. It’s called a named range, and I found out I had always been doing it wrong. So have you, most likely.

Have you ever written something like this?


s = Application.WorksheetFunction.Sum(Range("A1:A10"))

You thought you were quite smart, using SUM to add some values together that would have taken a loop in VBA. You petted yourself on the back, took a beer, and applauded yourself for you being awesome. Well, you’re not awesome. This is lame. You suck. Ok, maybe not, but watch this:


s = [Sum(A1:A10)]

It’s incredible! This gives the exact same result.  You may wish to prepend with a sheet name, so it’s an exact reference. You can use any kind of name inside the square brackets. So, if A1:A10 is named ‘records’ in Excel you could rewrite this code to


s = [Sum(records)]

Note that with this notation we don’t use double quotes around the name of the range.

Vanishing Variables

CaptureA quite mighty use for this, is that we can now write code with a lot less variables, if we let Excel do the work for us.

  1. make a new sheet
  2. name it ‘variables’
  3. make three columns: name, value, description

Now you can fill the table you just made with all kinds of settings, constants or calculations that you want to use in your elaborate VBA program.

I am a lazy teacher, and I have a lot of students, so I work a lot with short macros that can help me get more spare time. Here’s an example:

    For Each subfolder In FSfolder.SubFolders
        If subfolder Like "Student *" Then
            [studentNr] = Right(subfolder, 6)
            If Not FileExists([sClass]) Then
                FS.createFolder [sClass]
            End If
            FS.MoveFolder subfolder, [newFolder]
        End If
    Next subfolder

In line 3 I fill the named range ‘studentNr’ with a value taken from a folder, which contains a student number. In the next line, I check if a folder for that students’ class has already been made, and if not, I make the folder. The variable [sClass] does not exist in my code. It only exists in my variable table in Excel. The cell contains a VLookup function to find in which class this student is currently enrolled. Similarly I have a [Teacher] variable, also with a VLookup function. NewFolder is simply a concatenation and formatting, which is also easily done in Excel.

As you can see, the values for [sClass], [Teacher] and [newFolder] are filled automatically by Excel, and I don’t have to process anything.

Programming like this in Excel is a new paradigm. You don’t churn out all your code top-to-bottom as you used to. You create sheets with lots of calculations, lookups etc, and then you make a tiny program that links all this together. Excel can do some crazy fast, complex stuff, and you should never have to program those anymore!

Crazy Caveats

Well wasn’t that incredible? You may not be used to programming like this, and I recommend this method only for experts. You should be in full control of the worksheets, or otherwise someone will mess with your program. Also, when you are part of a team, you should make sure this ‘magic’ is elaborately documented in the code (e.g. in a header of the function mention which Excel named ranges are used).

Good luck.

Learn more: FastExcel Blog

 

How to fix the document map in MS Word 2007

The Document Map! That horrendous, horrible, horrific horror! It never does what I ask it to, it never shows what I tell it to show, it constantly messes itself up beyond recognition. It’s like a crazy guy beating himself up, I keep thinking of Fight Club. The first rule of Document Map is you don’t talk about Document Map. And yet, I am going to talk about it.

Today I suddenly found out the magic trick, the silver bullet that fixes the problem that had haunted me for so long.

As any professional MS Word user, I use the map extensively, even though its quirks pester me constantly. It’s just to powerful to let go.

What are its problems?

  1. Outline works, as long as you have very strict usage of headings. A lot of documents fail this, especially if they aren’t created by yours truly (ahem).
  2. Titles show no outline numbering, even though it’s definitely there. This is more an annoyance then an outright problem.
  3. The map is not updated. Grrrrr.

It appears Microsoft has put some fuzzy logic in the document map. Now most of the time I don’t really care about MS’s guess at what I want to do. If I press TAB then most of the time I actually want an indent. If I type ‘Teh’ I do mean ‘The’. And of course, I know how to disable that behavior.

However, the document map seems to magically create non-existing headings from my text, and not in the farthest corners of options and settings is there mentioning of the document map doing this. It’s like an easter egg, but not a very funny one.

And then I did what I often do when cleaning up documents. I pressed “CTRL+Q+SPACE”. If you don’t know this trick, it’s the ultimate cleanup action. It removes all and every formatting. I had the map open because I was about to do some things with headings when I noticed after pressing ‘CTRL+Q’ that the item disappeared from the map, as it should because it wasn’t a heading! Then, by not continuing with CTRL+SPACE I retained the original formatting (bold in this case).

Before you select all text and press CTRL+SPACE remember it wil turn lists into regular paragraphs, so make sure to just do this on the items that need to be removed from the map. If that still doesn’t work continue to press CTRL+SPACE. You will lose formatting, which you should recreate using styles. And if that fails make sure to press the Normal style button to revert to the default formatting.

The second problem appeared to actually be a bug. A heading right after a page break will not show its numbering. Well, that’s minor and you can always select ‘page break before’ in the style settings.

The third was related to both 1 and 2, perhaps it won’t happen again. I hope so, I really do.

Try it out and let me know of any other problems you encounter with the document map.

Update: someone created a macro for fixing the document map!

A physics model of a physics model

One of the hardest topics in school next to computer science is physics. Time and again I see students struggle with the topics, and I have to admit I didn’t find it easy in high school myself. So I started studying physics again using Khan Academy, and gained much insight into the basics. I have admired Sal Khan since he started his site, and as a school teacher I also think I should help him, and my students, wherever possible.

I started drawing a diagram of how all things in physics connected to each other, and while doing that, I realized there should be software to do this better. After a short search I found the wonderful D3 toolkit, based on web standards, just as I like it. D3 can create awesome diagrams, graphs, infographics, you name it.

So now I have for your pleasure and learning aid a physics based overview of common physics quantities. Click the image to see the actual animated model.

Some tips on how to use it:

  1. Click the diagram to automatically redraw it
  2. Drag nodes to change the layout
  3. Mouseover a node to see related info
force diagram physics

force diagram physics

Use in physics assignments

The colors in the diagram denote related quantities:

  • blue – essential axiomatic units m, l, t
  • green – related to movement
  • orange – related to energy
  • red – related to a single unit

To use this diagram in your physics assignment, look at the known variables. Find those in the diagram. Now see if you can combine those to make more variables: follow the arrows. If you can start from two known values you can usually calculate a third. You continue doing this until you have found the desired outcome.

Salman Khan is the smartest man in the world

The world is changing. At some point in the future, says Ray Kurzweil, computers will be more intelligent than humans. He thinks it will happen in the year 2045 to be exact. That’s a scary thought, but he could be right. Many say he is the smartest man in the world.

But today I realized that Salman Khan is ahead of him. Khanacademy is an online tutorial for everything including math, biology, economics, chemistry, physics, history, and more. And Khan made all videos himself.

But not only did he do that, he is expanding at an exponential rate, to offer his courses to classrooms. And that could revolutionize education.

And all of that… for free, because he is filthy rich anyway, and doesn’t care about making more.

World changing: Khanacademy. See for yourself.

Why I don’t like the Facebook Like button

thumbs up

thumbs up

First of all, I hate the “Like” button. However, I have to be honest, I hated the “fan” button even more. Now I am not just being negative. Let me give you some background on this.

The Fan button

The first time I saw the “become a fan of ….” button on Facebook I didn’t know what to think of it. I started thinking, OK, so what am I a fan of. I soon realized it’s a very short list, that also changes over time: Rubik’s cube, Aikido, House, Monk, Rush, Led Zeppelin, and some others some to mind. But to say I am a fan of e.g. Harry Potter is quite exaggerated. So, reluctantly I clicked the fan button for “House” (the TV show, not just any house). The next thing I knew I found myself drowned in messages about (from?) my temporary favorite TV show. It’s just plain stupid. I do want people to know I like the show, but not get stupid messages all the time from it. So I stopped being a fan of anything else.

The Like button

I was happily surprised when Facebook decided to change the Fan idea to a “Like” idea, thinking they must have heard some voices echoing my thoughts. Unfortunately though, the button still does the exact same thing: squirting out a barrage of nonsense messages. I soon found out though that now it’s quite easy to hide those messages without losing your ‘like’ status.

Google Reader

Another thing I am fan of is Google Reader. Ever since I got a new PC I never felt like installing applications. I realized that even though compiled software is faster, and sometimes has really nice features, using online software just works anywhere, and is good enough for me. Also I have been moving around between computers more, so it becomes quite necessary to synchronize application status. But that’s another story.

Google Reader has had a “like” button perhaps even longer than Facebook (can anyone tell me if that’s true?). And since Google Reader doesn’t followup the click of a “like” with a barrage of nonsense messsages, I was happy to click “like”. After a while though, I found out that on the top of my ‘suggested reading list’ were a hole bunch of really stupid posts. I thought: “there must be something wrong with Google’s algorithm”. And then it hit me: Google uses the “like” button to learn what you want to read. But I had been using it all wrong. I clicked “like” usually when something was really, really funny. And so Google decided I was mostly interested in humor. And that’s pure BS. I like a joke once in a while, but not all the time, when there’s interesting serious news ahead. And it was impossible for Google to tell what I found really hilarious and what I found just plain stupid (a sense of humor is not a computer’s strong point).

So I revised my like-button-decision-making-process and was happily surprised when only weeks later Google had adapted itself back to a more serious note, and I could suffice with reading the top 50 or 100 items, and send the rest of the news to a peaceful death.

Still, I was faced with another issue: once in a while I read interesting news about natural disasters. But I found it morally appalling to mention to Google that I liked the fact that over 200,000 people died in Chili, or that dozens died in a freak accident. But I learned the hard way now it’s wrong to think that way, we have to redefine the “like” button to an “adjust-neural-net-weight”-button, even though that sounds a lot more complicated.

Now what’s next?

I am going to continue clicking ‘like’ for everything that I wish to associate myself with, or wish to receive messages about in the near future. My advice to readers: please realize that there is a commercial incentive to all this ‘liking': ads will always be tailored to the user, wherever possible. And Facebook nor Google can do this properly when it doesn’t know what you ‘like’.

So far, I like where it’s going, though the wording is wrong in every single way.

How to ruin your website in 10 steps

I have worked for almost 10 years in web development now, and here’s what I have learned from the clients: you can ruin your website too!

Here’s how:
1. Get the cheapest designer you can get. If your nephew says he can do it, let him do it. It’s not that hard.
2. That being said, you can also do it yourself. Why pay for some fancy designer when you can buy or get a template. And HTML+CSS is easy, right?
3. Even cheaper, copy someone else’s site and just change the text.
4. Don’t start a forum or community. There are a lot of people out there ready to say nasty things about you.
5. You should get a CMS if possible. Then you can control everything yourself real easy, and choose any font and color. Did you know with a CMS you can just copy straight from MS Word? You can make sure people notice your links by using a big red font.
6. Don’t listen to any advice from your designer. They’re just after your money, not to help you.
7. If it looks good in Internet Explorer, then you’re done. There are so many browsers there’s no keeping up with it, and most people use IE anyway.
8. Almost everyone has broadband, so keep your images in high quality. Browsers resize them anyway when they display them.
9. Tables are cool. With tables you can get layouts done that are really hard without them.
10. Get as much information from your visitors as possible. When they sign up, ask for name, gender, location, email, interests. You can even sell that information later and make a good profit.

And all of the above will help you to get the lowest possible position on Google. But there are companies that can help you to become #1, just wait for the ‘offers’ to come in.

Adrian Monk and the case of the crippled DVD driver

Finally solved all computer problems, where nobody else could. After having Infoman have a look at my PC they said it was a virus, and it was repaired, but the problem (sudden reboots) kept occurring. In addition, my DVD player didn’t work anymore. I opened the case and found a dangling cable end, almost touching the motherboard. I think when the fan started blowing, it would sometimes short-circuit the MOBO (it’s a guess still). Anyway I secured it and the problem is now gone. 2 weeks, no reboot.

The broken CD player was a problem with an application named ‘HotSpot Shield’. Actually a pretty good app that enables you to access all kinds of sites that are blocked in your country (e.g. Veoh, Hulu, some YouTube vids). But, the thing came with very annoying ads so I uninstalled it. A couple of days ago I bought a wireless router and setup a network. All went well, but the connection dropped suddenly. I checked the TCP/IP stack and saw ‘Hotspot Shield’ as a layer in there. Damn! I said: “He’s the guy”.

Adrian Monk

Adrian Monk : "here's what happened"

Here’s what happened (Adrian Monk style): “At first we thought it was maybe a virus causing this, but cleaning with plenty of scanners didn’t resolve it. The next suspect was the hard disk, but even chckdsk /r didn’t solve the problems. Then we suspected the drivers themselves and reinstalled and updated all drivers. That didn’t work. But then we found HotSpot leaving a trace in network settings. Hotspot Shield needed to intercept all traffic in order to make sure it looked like it was coming from the US. So it put itself not only in Program Files, but also as an add-on in browsers, and as a layer in TCP/IP. That interfered with the built-in modem driver, which then failed. The modem driver looked for another resource and used one that conflicted with the DVD drive. After removing HotSpot from all network settings and uninstalling the DVD driver, it came back automatically and started responding again. Patient saved, and HotSpot put away.”

The Real Sixth Sense – digital meets real life

While Microsoft is playing with their $10,000 surface table, MIT creates a $300 alternative that does more. It’s a wearable device aptly named Sixth Sense that lets you interact with the digital world, in awesome ways. Just watch…

5 breakthrough technologies that will go mainstream in 5 years

This is my list of 5 technologies that you probably don’t know about yet, and that will go mainstream in less than 5 years. I’d like to check back here in 5 years to see just how right (or wrong) I am.

1. Online rendered games

OTOY delivers server based live rendering. That is a cool thing that can enable live playing of console games that you don’t own. Of course it can also be used more professionally by e.g. architects and film makers.

video

2. Peer to peer hosting

Opera recently announced Opera Unite, their server-less website technology. Although it may not be Opera itself that brings this mainstream, it will certainly change the world. Currently websites are all hosted and delivered by servers not owned by ourselves. The middleman known as ISP will always be in between, charging for services, blocking stuff at will, and basically controlling our behavior on the internet. We (that is everyone) don’t like this, and it needs to change. P2P (peer-to-peer) technology like BitTorrent is key to this, but only supports file sharing. What is needed is web-sharing. Opera Unite is still very beta and techie, but once a “publish button” is available, that could well change. You don’t need Flickr to share pictures with friends, really. But we do need powerful technology like PHP, Java, .NET to be integrated. Javascript alone doesn’t make for the most interesting sites.

3. Augmented reality

It will take a bit longer than just software, but the world is advancing fast to a point where we can start using this. Simply glasses with small screens on the inside will work. GPS in your phone and netbook help to render relevant scenes laid over the exact location you are at. “Assisted reality” may be the first step: point your phone to a product and see it’s specifications, price comparison etc directly on your phone’s screen.

Wikipedia article

4. Flexible displays

This is another one that I envision to arrive on the market soon (a couple of years). You just roll up your e-reader, or roll it out of your netbook. It’s already for sale.

5. Online software development

Sites like topcoder, guru, but especially oDesk will become the trend. But the people you hire will not be unknown to you. Through social networking, virtual presence and web cams they will be much more part of the company then just outsiders. This will require a change in culture though. Currently people are quite ‘scared’ of showing their face to someone on the other side of the Ocean. But just like social networking it will grow quickly.

Things that will go unnoticed

  • social networks – ok, they’ve already made it, but they will not change drastically, and you notice a fatigue starting in everyone who has had the facebook rush for a while. The close circle of friends will continue, but 900 friends? C’mon. Facebook/twitter will end up in Google’s or Microsofts hands anyway.
  • Google wave – It’s the marriage of IM, picture album, bulletin board and email. I believe it actually will take over as the next version of GMail, but it will not be much more than that.
  • Wolfram Alpha – one of the latest hypes. Ultimately just one of the places people may go to to find answers. Will probably be bought by Google anyway.

I had no place for it on the list, and it may take more then 5 years, but solar power will also make it, eventually (within 10 years).

Follow

Get every new post delivered to your Inbox.