Download Datasets
    The links below provide the processed and normalized gene expression datasets used in the SEEK human compendium, organized by platform.

    All datasets on a given platform were processed uniformly, so the collection is suited to coexpression, differential expression, and other downstream analyses, both within and across datasets.

  • GPL570.tar.gz 3.6G, 1960 datasets
    [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array
  • GPL96.tar.gz 814M, 516 datasets
    [HG-U133A] Affymetrix Human Genome U133A Array
  • GPL6244.tar.gz 622M, 394 datasets
    [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version]
  • GPL5175.tar.gz 177M, 120 datasets
    [HuEx-1_0-st] Affymetrix Human Exon 1.0 ST Array [transcript (gene) version]
  • GPL571.tar.gz 276M, 269 datasets
    [HG-U133A_2] Affymetrix Human Genome U133A 2.0 Array
  • GPL8300.tar.gz 66M, 96 datasets
    [HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array
  • GPL1708.tar.gz 80M, 69 datasets
    Agilent-012391 Whole Human Genome Oligo Microarray G4112A (Feature Number version)
  • GPL4133.tar.gz 397M, 331 datasets
    Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (Feature Number version)
  • GPL6480.tar.gz 352M, 310 datasets
    Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (Probe Name version)
  • GPL6884.tar.gz 243M, 107 datasets
    Illumina HumanWG-6 v3.0 expression beadchip
  • GPL6947.tar.gz 383M, 180 datasets
    Illumina HumanHT-12 V3.0 expression beadchip
  • TCGA.RNASeq.tar.gz 279M, 224 datasets
    TCGA RNASeq V2 collection
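
    Once an archive from the list above has been downloaded, it can be unpacked and its .bin files listed along these lines. This is only a minimal sketch: the function name extract_platform is hypothetical, and because the internal directory layout of the archives is not described on this page, the sketch searches the extracted tree recursively rather than assuming a fixed path.

```shell
#!/usr/bin/env bash
# Unpack one platform archive into its own directory and list the .bin files.
# Assumption: the archive has already been downloaded (e.g. GPL8300.tar.gz).
# extract_platform is a hypothetical helper name, not part of SEEK or Sleipnir.
extract_platform() {
    local tarball="$1"
    # Default destination: a directory named after the archive (GPL8300.tar.gz -> GPL8300).
    local dest="${2:-${tarball%.tar.gz}}"

    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1

    # The layout inside the archive is not documented here, so search recursively.
    find "$dest" -type f -name '*.bin' | sort
}

# Example call (commented out so that sourcing this file has no side effects):
# extract_platform GPL8300.tar.gz
```

    Starting with one of the smaller archives, such as GPL8300.tar.gz (66M), is a quick way to check the layout before fetching GPL570.tar.gz (3.6G).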

    How do I view the *.bin files in the tarball?

    Each .bin file is an expression matrix stored in a binary format. To view its contents, convert it to text with the PCL2Bin tool from the Sleipnir toolkit. (If Sleipnir is not already installed on your system, either install it first or use the pre-compiled PCL2Bin executable available here.)

    Then run the following command:

    PCL2Bin -i GSE10107.GPL4133.pcl.bin -o GSE10107.GPL4133.pcl

    In this example command, -i names the input binary file and -o names the output text file. In the output file, the columns are samples, the rows are genes identified by Entrez Gene IDs, and the values are log-normalized.
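
    Since each archive holds many datasets, the same command can be applied to every .bin file in a directory. The following is a rough sketch rather than an official script: the function name convert_bins is hypothetical, PCL2Bin is assumed to be on the PATH, and only the -i and -o options shown above are used.

```shell
#!/usr/bin/env bash
# Convert every *.pcl.bin file under a directory to text with PCL2Bin.
# Assumptions: PCL2Bin is on PATH; convert_bins is a hypothetical helper name.
convert_bins() {
    local dir="${1:-.}"
    local bin out

    find "$dir" -type f -name '*.pcl.bin' | sort | while IFS= read -r bin; do
        # Drop the trailing .bin: GSE10107.GPL4133.pcl.bin -> GSE10107.GPL4133.pcl
        out="${bin%.bin}"

        # Do not overwrite a text file that already exists.
        if [ -e "$out" ]; then
            echo "skip (exists): $out"
            continue
        fi

        # stdin is redirected so the tool cannot consume the file list being read.
        PCL2Bin -i "$bin" -o "$out" < /dev/null || echo "failed: $bin" >&2
    done
}

# Example call (commented out so that sourcing this file has no side effects):
# convert_bins GPL4133
```

    A converted file can then be opened in any text editor or spreadsheet program; for large matrices, viewing only the first few lines (for example with head) is usually enough to confirm that the conversion worked.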