Excel gurus in the house?
Excel gurus in the house?
I have a problem I've been trying to work out and can't seem to come up with a good solution, was wondering if the collaborative hive mind here could help.
Here's the situation. I have a spreadsheet with over a half a million rows of data corresponding to approximately 42,000 separate items. I have three columns that contain data I need. A = item name. B = attribute name. C = attribute value. Currently thre are duplicates in column A, for example A could have item "abc123" listed multiple times with different values in B and C (such as B=color: C=blue, B=height; C=1, B=width; C=2, B=length; C=2.5, etc.)
I'm looking for a way to break each attribute name in column B into it's own column width the contents of C as the data in the cell. Can this been done with a few formulas?
With such a large amount of data doing this by hand could take me weeks, and I'm trying to avoid that.
Appeciate the help ahead of time.
The first thing you need to determine is how many unique attributes are in column B. Current versions of Excel have a limit of just over 16,000 columns.
I'll admit that your instructions about what you are trying to accomplish are kind of confusing to me. Based on what I read, you are trying to break out all the information in column B into their own unique columns based on what the value is that is returned? If that is the case, you may have to create a formula in each column. I am sure I am missig something key that you are trying to get across because my first thought was why you aren't putting that much data in a pivot table.
But from what I read, columns 'A', 'B', and 'C' are where your data exists and other columns are empty, so your VLOOKUP above won't work because the first argument is your lookup value, and you are pointing it toward an empty column. If you can take a few pieces of the data and take a screenshot of it in it's current format, and then take two or three rows, scrub them to show how you want it to look and take another screenshot, then post both pics, I could help you the right formulas or VLOOKUPS maybe. If the data came out of an EDW or something, post your SQL script and I can see if I can help there to maybe get your result without so much manual EXCEL work.
I fully admit though that it has been a long day so I am tired and what you typed may have made perfect sense. ;)
Try looking here http://office.microsoft.com/en-us/excel-help/how-to-look-up-a-value-in-a...
Uhg, using a spreadsheet as a database... using the right tool would help...
I could do it in an excel macro, but the programmer in me always goes for macros over excel functions. If you can't get vlookup to work I might be able to make a macro for you.
A pivot table, as suggested by Fade, would likely be the easiest way to format the data you want though I'm not sure if the table can be exported into a .csv file in the manner it seems you're looking for.
Don't use the vlookup() for matching two criteria. You should be able to do the same thing with nested index(match()) formula.
My understanding is that your original data is in three columns (part name, duplicated; series of property names, duplicated for each part; data, unique to each part name & property name), and that the table you've created has the part name (unique values only) in column A, and the property names in row 1, and you want to create a table pulling the values from column C in the original dataset to populate interected cells in the new dataset.
try the following (dry-coded, you will likely need to play with the cell references)
=index(sheet1!$a$1:$c$600000,match($A2&B$1,sheet1!$A$1:$A$60000&$B$1:$B$600000,0),3)
This should (I believe) pull the value from column C that corresponds to the value of the part name in the first column on the second sheet (you should put your new dataset in a new sheet) and the property name in row 1
Copy the formula to all the cells you want data in, then select the range of data, press F2 to open one of the cells for data entry and Ctrl+Shift+Enter to enter all the cells as an array formula.
Without a select of the dataset, it's the best I can do unfortunately.
If it works, select all the cells in the new worksheet, copy and paste special to paste values and overwrite the formulas because that many index formuli will seriously bog shit down
Edit:
I figured I'd get around to testing this out and seeing if it would work. It does and the formula I used for my test data was {=INDEX(Sheet1!$A$1:$C$15,MATCH($A2&B$1,Sheet1!$A$1:$A$15&Sheet1!$B$1:$B$15,),3)}. Enter it into one cell as an array, then copy right and down for the rest of the range. Obviously, the ranges and references would need to be changed. I tried to attach a copy of the file, but it seems we can't, which is probably for the best considering some of the people here. (I'm looking at you, denizens of site chat and links to lemonparty.org as a "halo stat site, check it out!" You know who you are)