Deleting multiple JMP data table columns
Delete Columns
The JMP documentation on scripting includes sparse information on how to use the 'Delete Columns' command effectively when working with large amounts of data. Here is what I have found to work well, especially in conjunction with the 'Join' command to join two tables that have a lot of similar (but not duplicate) data that you do not want to keep in the output table.
First, you will want to get all column names from your data table, and store it in a variable:
cc=dt << Get Column Names;
Here, the "dt" variable is the data table (hence the abbreviation "dt") that has already been loaded with a JMP data table type through a command earlier in your script. Now that we have all of the column names in a variable, we can iterate through the columns that we know we want to delete, and then reuse this iteration on subsequent tables with the same structure as well. It is easiest to do this with a For loop:
for(i=N Items(cc), i>=10, i--,
cc[i] << Set Name("tmp");
dt << Delete Columns("tmp");
);
This For loop utilizes several different JSL specific functions, so let's break down what is happening. The basic For loop structure is just like C or C++. The 'N Items(cc)' command is a JSL command to retrieve a count of the amount of items stored in a list, which in this case is the number of column names stored in the variable "cc". You will notice that unlike a standard For loop of incrementing "i" by one with "i++" I have chosen to decrement "i" by one on each loop. I chose this route in my code because I have found that new columns that I create in a JMP table generally end up at the end of the columns in a table, and I am oftentimes creating tables to calculate new values off of existing columns. Therefore, when I attempt to delete a table prior to those calculation tables, JMP will complain (or the JSL will simply bomb without warning without actually deleting the column). To avoid any problems with calculation columns, I have chose to start at the end of the table and work my way back as I delete columns, which is why I work backwards through my loop with the "i--" instruction.
So why can't I just begin deleting each column in the list "cc"? Why do I rename every single column I want to delete "tmp" with the 'Set Name("tmp")' directive? Well, this is an unfortunate but necessary workaround for how JSL handles messaging to columns. Since a column name can contain spaces and special characters, the 'Get Column Names' function will wrap the Name("") function around columns in the list with spaces or special characters. This behavior can create a list that has mixed column name formats. When you are working on large Stepwise and Least Square Regression platform outputs as I do, you will end up with a lot of mixed column name types because of JMP's default output styles for new column variables that it creates. So, in order to avoid any problems, it is easiest to just rename each column you are going to delete as "tmp" (if you don't already have a "tmp" column in your list), and then delete the newly named column "tmp".
This brings up another annoying thing about the 'Delete Columns' directive in JSL: it doesn't seem to like variables or pointers to column names. I have tried every possible way I have thought of to use command like this to delete, but none of them have worked:
dt << Delete Columns(cc[i]);
dt << Delete Columns({cc[i]});
dt << Set Selected(1);
dt << Delete Columns;
The last two commands should in theory work, according to the Scripting Guide with JMP 7, but I haven't found it to work at all.
Let me know if you found this helpful, or have any tips for improving on this strategy for deleting multiple, mixed name type columns in JMP because I would love to simplify my scripts further!




Comments
Deleting specific multiple columns
The bit of script you provided to delete multiple columns was very helpful! I have a similar problem that I can't figure out...
I want to delete specific columns that all start with the name "Column" . For some reason I get a ton of columns with blank data and I'd like to delete those. They are named "Column 330" - "Column 500" so I've been trying to do a pattern match where it deletes every column that has the name "Column" in it. But that didn't work and neither did this:
cc=dt << Get Column Names;
for(i=N Items(cc), i>=330, i--,
dt << Delete Columns("Column " + i);
);
Any other ideas??
Thanks!
You cannot concatenate within the Delete Columns function
The code you need looks like this:
cc = dt << Get Column Names; try( for(i=NItems(cc), i>=330, i--, cc[i] << Set Name("tmp"); dt << Delete Columns("tmp"); ), show(i,exception_msg); throw(); );The reason your code wasn't working was because you were trying to do some coding within the 'Delete Columns' function that just won't work. The 'Delete Columns' function just simply doesn't like any calls to other functions - in your case, the attempt to concatenate a string and a numeric variable - and must simply get only the column name passed into the 'Delete Columns' function in order to work properly.
deleting specific columns by column name
Is there a way to read the column names and based on those delete the columsn? That is actually what I want to do. I don't always know that I want to keep the first 330 columns... I had that in my code just to try out the Delete Columns function.
This is something else I've tried;
cc=dt << Get Column Names;
for(i=N Items(cc), i>=10, i--,
pattern = "Column " + patSpan("0123456789");
if(patMatch( cc[i], pattern), dt << Delete Columns(cc[i]), show(cc[i]));
);
So here I am trying to match the Column name using a pattern. But it never matches... I really just want to delete any column that starts with the word "Column"... they could appear anywhere throughout my data and there could be 1 - 100 of them... I don't always know.
I know you said that this
dt << Delete Columns(cc[i])
doesn't work, but I didn't know what else to put in my "if" statement and the pattern isn't matching so it isn't getting to that point anyway.
Thanks!
Get Name function
OK, well in that case, you are still going to need to rely on my code to actually delete the table, because the Delete Column() function is very picky. You have to provide it with ONLY a quoted actual name of a column. I haven't found any other way to do it.
However, in order to do what you are suggesting, I do believe you can use the Get Name() column function (because it isn't as picky as the Delete Column function for some odd reason) to locate the appropriate column using a For loop as you suggest doing to pattern match. So you could probably get away with something like this:
patMatch = Get Name("Column " + patSpan("0123456789")); Column(patMatch) = Set Name("tmp"); Delete Column("tmp") ;Then you can reset the name of that particular column to "tmp", delete column "tmp", and then move on to locating the next column. Let me know if that works.