Hi-C Data

Data description

K562 in-situ Hi-C

K562 intact Hi-C

K562 Deep intact Hi-C

A deep intact Hi-C dataset of ~25 billion reads, generated by the Aiden Lab from K562 cells.

  • Wiki page
    • https://www.synapse.org/#!Synapse:syn18134906/wiki/619632
  • Hi-C Matrix
    • https://s3.us-east-1.wasabisys.com/aiden-suhas/hic_files/FINAL_GRCh38_processing/K562/total/inter.hic
  • Current loop calls (shared by Erez at last Friday's meeting)
    • https://www.dropbox.com/sh/13ihnis7exi9lrp/AADgaBk0RoNDRAC62F5QpPFDa/K562/3.15.23/localizedList_primary_10.bedpe?dl=0
  • Juicebox visualization
    • https://tinyurl.com/2psgwswh
  • Compare

Hi-C Matrix

The Hi-C matrices were downloaded from ENCODE and are stored in .hic format.

  • K562 Deep intact Hi-C
    • Juicebox: https://tinyurl.com/25p52jkq
  • Comparison of K562 in-situ (ENCSR545YBD), intact (ENCSR479XDG), and deep intact
    • Juicebox: https://tinyurl.com/24hrg3ye

Loop calls

HiCCUPS method (reference)

observed        Raw contact count supporting the bin pair (i.e. the number of Hi-C read pairs falling in these two bins)
expectedBL      Expected count estimated from the “bottom-left” (lower-left) neighborhood of the pair
expectedDonut   Expected count estimated from the “donut” neighborhood (the ring-shaped region of bins surrounding the pair)
expectedH       Expected count estimated from the “horizontal” neighborhood (same row, flanking columns)
expectedV       Expected count estimated from the “vertical” neighborhood (same column, flanking rows)

The observed-over-expected score for each loop was calculated from the counts summarized in the loop file.

\(\text{Log2OE} = \log_2\left(\frac{\text{Observed raw contact count}}{\text{Expected count in the surrounding bins}}\right)\)
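
The score can be sketched in Python as below. This is a minimal illustration, not the pipeline actually used: it assumes the loop file is tab-separated with a header row naming the columns, and it assumes expectedDonut is the "expected count in the surrounding bins" (the notes do not say which expected column was used, so expected_col is left as a parameter). The function names log2_oe and score_loops are hypothetical.

```python
import csv
import io
import math


def log2_oe(observed, expected):
    """Return log2(observed / expected), or None if the ratio is undefined."""
    if observed <= 0 or expected <= 0:
        return None
    return math.log2(observed / expected)


def score_loops(handle, expected_col="expectedDonut"):
    """Yield one dict per loop with an added 'log2OE' entry.

    Assumes a tab-separated table whose first line is a header
    (a leading '#' on the header is tolerated). The choice of
    expected_col is an assumption, not taken from the notes.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.lstrip("#") for name in next(reader)]
    for fields in reader:
        # Skip blank lines and any further comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(header, fields))
        row["log2OE"] = log2_oe(float(row["observed"]), float(row[expected_col]))
        yield row


# Toy input with made-up numbers, only to show the call pattern.
example = (
    "#chr1\tx1\tx2\tchr2\ty1\ty2\tobserved\texpectedDonut\n"
    "chr1\t1000000\t1010000\tchr1\t1200000\t1210000\t80\t20\n"
)

for loop in score_loops(io.StringIO(example)):
    print(loop["chr1"], loop["x1"], loop["y1"], loop["log2OE"])
```

Returning None for non-positive counts keeps undefined ratios out of downstream summaries instead of raising an error or emitting infinities; swap expected_col to compare against a different neighborhood.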

TAD calls

Arrowhead (reference)

A domain is a square in the Hi-C matrix that Arrowhead calls.

Arrowhead domain score