Friday 12 September 2014

Pandas column slicing formats "['name']" and "[['name']]" are different

Using the slightly different slicing formats "[]" and "[[]]" gives different objects: single brackets return a Series, while double brackets return a DataFrame with one column. This can cause problems when you compare the result with other data.

print(allele_data.head())  # data
print(type(allele_data["start"]))  # format 1: single brackets
print(type(allele_data[["start"]]))  # format 2: double brackets

In [157]:
     chr      start  ref_score ref alt  ref_index ref_strand  alt_score
0   chr1  186214179   0.822386   C   T         10          -   0.768521  
1  chr20   49942978   0.959431   A   G          1          -   0.953408  
2   chr1  144989929   0.649916   A   G         11          -   0.666702  
3   chr4    8548970   0.803862   G   A         15          -   0.773032  
4   chr8  135550588   0.892755   C   T          7          +   0.843062 

In [159]: <class 'pandas.core.series.Series'>
In [161]: <class 'pandas.core.frame.DataFrame'>
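
To make the difference concrete, here is a minimal sketch. It assumes a small toy frame built from three of the rows shown above (it is a stand-in, not the real allele_data), and it shows one way to convert between the two forms so that both sides of a comparison have the same type. The variable names are only for illustration.

```python
import pandas as pd

# Toy stand-in for allele_data, using three of the rows shown above.
allele_data = pd.DataFrame({
    "chr": ["chr1", "chr20", "chr1"],
    "start": [186214179, 49942978, 144989929],
})

as_series = allele_data["start"]    # single brackets: 1-D Series
as_frame = allele_data[["start"]]   # double brackets: 2-D DataFrame, one column

print(as_series.shape)  # (3,)
print(as_frame.shape)   # (3, 1)

# Convert explicitly before comparing, so both sides are the same kind of object.
frame_to_series = as_frame["start"]     # DataFrame -> Series
series_to_frame = as_series.to_frame()  # Series -> DataFrame

print(frame_to_series.equals(as_series))  # True
print(series_to_frame.equals(as_frame))   # True
```

Checking the type or shape of a slice before comparing it with other data is usually enough to catch this kind of mismatch early.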
