Thursday, December 30, 2010

Analyzing text documents for phrases using Excel

This was really a question I found at excelforum.com; can you take a list of phrases in excel and calculate in a simple way how many times each term is used in a word document? The answer is yes!

First, you need to save the doc as a txt-file. Then import this into Excel using the text import wizard - import as delimited file without any delimiters defined (remove the checkmark before "tab"). Import this into column A of Sheet2.

Then, place the list of words in column A in Sheet 1. If the first term to look for is placed in cell A1, you put this in B1 to count the number of times the term is used in the text file:

=COUNTIF(Sheet2!A:A,CONCATENATE("*",A1,"*"))
For fun, let's test how many times a few terms occurs in this political article on cnn.com

Update: 
6stringjazzer at excelforums.com reminded me that one should put each word into its own cell to avoid missing counts when the same term is used several times in the same line.

Wednesday, December 29, 2010

Dynamic optimization in Excel?

I've been strugglig to solve a rather complex optimization problem lately. The equality constraints are defined partly as nonlinear differential equations, partly as algebraic equations of those time-varying variables solved for by the ODEs, and a bunch of inequality constraints. The degrees of freedom are functions of time. This is a so-called dynamic optimization problem. These problems are defined as follows;




Normally, you would solve these problems either using expensive software tailored to the purpose, or by discretizing the input u and using advanced optimization software to solve the resulting large nonlinear programming problem. This was tried, with the venerable l_bfgs_u algorithm available in SciPy. Well - it was slow and not very robust.

Excel to the rescue! The model was implemented using forward Euler with very short step length, resulting in an Excel model with 150.000 linked cells. The objective function was a quadratic function with 24000 terms. And the Excel solver was actually capable of handling this! Incredible!

That being said - this was solved using Excel 2004 for Mac. It seems the solver in this edition has not been completely stripped down, forcing you to buy the "professional" license from Frontline. Or maybe I am wrong on that - it did work to solve my problem.

Tuesday, December 21, 2010

Easy access to the format painter in Excel 2007

I tend to use the format painter in Excel when I am making the final touch to a spreadsheet before sharing it with anyone. I am, however, not very found of "clicking" on buttons to make this happen. Hence.... the search for a keyboard shortcut was started. There are several solutions - common to them is that they are not really time-saving, they are just awkward.For example, here is one:

  • First, use CTRL + c to copy the parent cell (including format). 
  • Then hit Alt+E,S,T,Enter to paste format only to the active cell.
Elegant? Think not....

Well, here's what I came up with: add the format painter to the quick access bar. (Right click it on the ribbon to access a context menu, choose "add to quick..."). 


Then, you can access the format painter by hitting the automatically assigned keyboard shortcut (depending on where you put the Format painter on the quick access bar). In my case, I reach it by hitting Alt+4. You find out by hovering your mouse over the quick access button for the painter. 

Saturday, December 11, 2010

Avoiding complex calculations in Excel

Have you ever received some data in an Excel sheet that you need to do some rather complicated computations on before you share results with other users? Sometimes, the analyses we need to do are outside the realm of a spreadsheet. Then, why not use a library that allows you to read Excel data into a data structure in your programming language of preference, do the computations, and then write back to the excel sheet? This can save you a lot of headache in maintenance and allow you to use more powerful libraries than those available in Excel. Here is an example with Python as the programming language.

First, install the package xlrd from here: http://www.lexicon.net/sjmachin/xlrd.htm. Then, you can read in data to Python, manipulate at hearts will and get your results without struggling with Excel limitations. Also install xlwt. How to use these packages is explained here: http://scienceoss.com/write-excel-files-with-python-using-xlwt/. These are excellent blog posts - don't want to rip them off, go to the source :-)

I use these packages to combine results from Excel calculations with output from other types of software, for re-entering results into Excel files for easy referencing later (and sharing with colleagues).

Thursday, October 14, 2010

Getting the name of your worksheet into a cell - without using VBA!


VBAHave you ever needed to put the name of a worksheet into a cell? Of course, you can just type in the name. What if, however, you have 250 sheets with different names? That is an awful lot of typing. Formulas to the rescue! Next time, try this neat little trick...

Create a new workbook and save it with some name, for example test.xls. Then, create a nice name for your worksheet. Now, here's the magic formula:

=MID(CELL("filename",$A$1),FIND("]",CELL("filename",$A$1))+1,255)

Hm.... does it work?


Apparently it does!

So... what's the magic?

The function "CELL" returns information about a given cell, here we have asked for the filename - this one includes the sheet name as well:

hdisk:Users:username:path:[test.xls]My Fantastic Sheet Name

Now, the function "MID" returns the internal text starting from a certain character number and going on until a finish number. Here, the number we start at is the one after the "]" sign in the text string returned from cell. This number is found using the "FIND" function. 

Pretty nifty, eh? I'm sure this trick has saved me many hours of typing over the years!