Prepare Hi-C data 02 (intact Hi-C)

Download the data

Set environment

Code
source ../run_config_project.sh
show_env
You are working on             Duke Server: HARDAC
BASE DIRECTORY (FD_BASE):      /data/reddylab/Kuei
REPO DIRECTORY (FD_REPO):      /data/reddylab/Kuei/repo
WORK DIRECTORY (FD_WORK):      /data/reddylab/Kuei/work
DATA DIRECTORY (FD_DATA):      /data/reddylab/Kuei/data
CONTAINER DIR. (FD_SING):      /data/reddylab/Kuei/container

You are working with           ENCODE FCC
PATH OF PROJECT (FD_PRJ):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC
PROJECT RESULTS (FD_RES):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results
PROJECT SCRIPTS (FD_EXE):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/scripts
PROJECT DATA    (FD_DAT):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data
PROJECT NOTE    (FD_NBK):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/notebooks
PROJECT DOCS    (FD_DOC):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/docs
PROJECT LOG     (FD_LOG):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/log
PROJECT APP     (FD_APP):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/app
PROJECT REF     (FD_REF):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/references
PROJECT IMAGE   (FP_PRJ_SIF):  /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/app/singularity_proj_encode_fcc.sif
Code
TXT_FOLDER=hic_intact_K562_ENCSR479XDG
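
The cells that follow rely on FD_DAT (set by the project config script) and TXT_FOLDER (set above). As a minimal sketch, a guard like the one below could stop early when either variable is missing. The helper name require_vars is hypothetical and is not part of run_config_project.sh.

```shell
# Hypothetical helper (not part of the project scripts): fail early if
# any of the named shell variables is unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "ERROR: variable $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call:
#   require_vars FD_DAT TXT_FOLDER || echo "fix the environment first"
```

The function returns a non-zero status rather than calling exit, so it can be used in an interactive notebook session without killing the kernel's shell.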

Run the download script

Run the download script (run_download.sh) that was generated in the previous step.

Execute

Code
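# Output directory that holds the download script and the downloaded files
FD_OUT=${FD_DAT}/external/${TXT_FOLDER}

# Quote the path so the commands still work if it ever contains spaces
cd "${FD_OUT}"
chmod +x ./run_download.sh

./run_download.sh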

Review

Check output files

Code
ls ${FD_OUT} | xargs -n 1 basename
K562.hg38.ENCSR479XDG.ENCFF126GED.hic_intact.contact_domain.bedpe.gz
K562.hg38.ENCSR479XDG.ENCFF256ZMD.hic_intact.loops.bedpe.gz
K562.hg38.ENCSR479XDG.ENCFF621AIY.hic_intact.matrix.hic
run_download.log.txt
run_download.sh
Code
ls -lh ${FD_OUT}
total 32G
-rw-rw-r-- 1 kk319 reddylab 219K Nov 13  2022 K562.hg38.ENCSR479XDG.ENCFF126GED.hic_intact.contact_domain.bedpe.gz
-rw-rw-r-- 1 kk319 reddylab 2.8M Nov 13  2022 K562.hg38.ENCSR479XDG.ENCFF256ZMD.hic_intact.loops.bedpe.gz
-rw-rw-r-- 1 kk319 reddylab  32G Nov 10  2022 K562.hg38.ENCSR479XDG.ENCFF621AIY.hic_intact.matrix.hic
-rw-rw-r-- 1 kk319 reddylab  51M May  2 17:45 run_download.log.txt
-rwxrwxr-x 1 kk319 reddylab  603 May  2 17:27 run_download.sh
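
The listing above is checked by eye. A scripted version of the same check might look like the sketch below, which reports whether each expected file exists and is non-empty. The helper name check_outputs is hypothetical, and the list of file names passed to it is whatever the caller expects to find.

```shell
# Hypothetical helper: report any expected file that is missing or empty.
# Usage: check_outputs DIR FILE [FILE ...]
check_outputs() {
    dir=$1
    shift
    status=0
    for fname in "$@"; do
        # -s is true only if the file exists and has a size greater than zero.
        if [ -s "${dir}/${fname}" ]; then
            echo "OK ${fname}"
        else
            echo "MISSING_OR_EMPTY ${fname}"
            status=1
        fi
    done
    return "$status"
}

# Example call with one of the files listed above:
#   check_outputs "${FD_OUT}" K562.hg38.ENCSR479XDG.ENCFF621AIY.hic_intact.matrix.hic
```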
Code
FP_OUT=${FD_OUT}/K562.hg38.ENCSR479XDG.ENCFF256ZMD.hic_intact.loops.bedpe.gz
zcat ${FP_OUT} | head -n 5
#chr1   x1  x2  chr2    y1  y2  name    score   strand1 strand2 color   observed    expectedBL  expectedDonut   expectedH   expectedV   fdrBL   fdrDonut    fdrH    fdrV    numCollapsed    centroid1   centroid2   radius  highRes_start_1 highRes_end_1   highRes_start_2 highRes_end_2   localX  localY  localObserved   localPval   localPeakID
# juicer_tools version 2.13.07
chr10   102835000   102836000   chr10   102901000   102902000   .   .   .   .   0,255,255   16.0    2.5453029   2.0566912   2.896359    2.6027875   0.0 0.0 5.9604645E-8    0.0 2   102835000   102901500   500 102834600   102835200   102901400   102901700   102834700   102901500   4.0 2.17173181723318E-4 0
chr10   123583000   123584000   chr10   123967000   123968000   .   .   .   .   0,255,255   17.0    1.2294405   1.126373    1.5320965   2.968846    0.0 0.0 0.0 0.0 2   123583000   123967500   500 NA  NA  NA  NA  NA  NA  NA  NA  NA
chr10   60780000    60782000    chr10   60828000    60830000    .   .   .   .   0,255,255   16.0    3.9354546   3.6036625   4.087633    2.7198699   3.993511E-6 1.3113022E-6    6.377697E-6 5.9604645E-8    1   60781000    60829000    0   NA  NA  NA  NA  NA  NA  NA  NA  NA
Code
FP_OUT=${FD_OUT}/K562.hg38.ENCSR479XDG.ENCFF126GED.hic_intact.contact_domain.bedpe.gz
zcat ${FP_OUT} | head -n 5
#chr1   x1  x2  chr2    y1  y2  name    score   strand1 strand2 color   score   uVarScore   lVarScore   upSign  loSign
# juicer_tools version 2.13.06
chr10   89790000    90730000    chr10   89790000    90730000    .   .   .   .   255,255,0   0.48733428805910184 0.43218752937273686 0.39101424267719964 0.4366181410974244  0.5132138857782754
chr10   117485000   118260000   chr10   117485000   118260000   .   .   .   .   255,255,0   0.47145995448002187 0.40388060738386966 0.40280872469677464 0.44690992767915844 0.4451019066403682
chr10   51125000    52215000    chr10   51125000    52215000    .   .   .   .   255,255,0   0.575707559391474   0.38559880334456653 0.3954850048221743  0.47122602168473726 0.40008340283569643
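
Both BEDPE files start with two lines beginning with "#" (the column header and the juicer_tools version), so head -n 5 shows only three records. To see how many loops or contact domains a file holds, the comment lines can be skipped when counting. This is a minimal sketch; the helper name count_bedpe_records is hypothetical.

```shell
# Hypothetical helper: count non-comment records in a gzipped BEDPE file.
# Usage: count_bedpe_records FILE.bedpe.gz
count_bedpe_records() {
    # gzip -dc decompresses to stdout (what zcat does on Linux);
    # awk counts every line that does not start with '#'.
    gzip -dc "$1" | awk '!/^#/ { n++ } END { print n + 0 }'
}

# Example call:
#   count_bedpe_records "${FP_OUT}"
```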

Check execution log

Code
head -n 10 ${FD_OUT}/run_download.log.txt
--2024-05-02 17:27:59--  https://www.encodeproject.org/files/ENCFF621AIY/@@download/ENCFF621AIY.hic
Resolving www.encodeproject.org (www.encodeproject.org)... 34.211.244.144
Connecting to www.encodeproject.org (www.encodeproject.org)|34.211.244.144|:443... connected.
HTTP request sent, awaiting response... 307 Temporary Redirect
Location: https://encode-public.s3.amazonaws.com/2022/05/15/0571c671-3645-4f92-beae-51dfd3f42c36/ENCFF621AIY.hic?response-content-disposition=attachment%3B%20filename%3DENCFF621AIY.hic&AWSAccessKeyId=ASIATGZNGCNXWCWCL4OE&Signature=3NvvPpiax3yeE7%2FE4wWH5CuDsSQ%3D&x-amz-security-token=IQoJb3JpZ2luX2VjEA4aCXVzLXdlc3QtMiJGMEQCIBpPJDCDpCVE1M1vw%2FGls2BE%2BFDxQ8WFkP9EWSCkOqZlAiBr7OX9Lb4BuCeL2Rzr2VonmZTiZLTREFMrvdXxSr9pDiqzBQhmEAAaDDIyMDc0ODcxNDg2MyIMP%2BrJcOeQq0JafnWlKpAFa5f7%2BX5lEZqVdYhDBt4%2BQmSGhQM%2BgTu8OBtUQy4fxsxQoTXpQDoEuqCjA7q1YtE5kSsb2lgURelfleGCt6j7YkGpYEqdFdh1auYeOaQKX99wGw2Yl3iLxn5WoBlUtbior0Pv4iGZtiNtMbiNmlvPQ5ahn654t42LjUZacPOo7XqXfHUjFEpd3uT3TSW68qg8reylYFMIRPr%2BRX6yMe64cImfiyFwOhbqK%2FlqQZ1qKHvC8mO3K9YAIceWtICkdbSMBs9gx1wMVyEnhSH70YvWvqsEqE6lx745%2F5VF%2FpujjxSLqegZj3m1jA94SMQi0DOSjRuOzXB%2FeUqaRECZQaxowHsn%2B02oAic16iIPyhMSDccLHfNRBDKa7WslPNfVsg25LlOQPQUfScCm4aq30AKjrczwM6tS7bLOCqc1fnLEE2GbMmrInXMeqCImMcb9NtYpbgsGFOq2922RBmJigTKexalmSQiNqsdqKIV7zN0qsAirdbvYiYyEQcwmApPcJ34mvrV2%2BaBLIqmt2lhUpOYiAtmEwCHt3J%2B8gh%2FVBOW9BOZmXGB3YKP8iml%2BUZKU2rj7F2KD9FhTu9zYh4sqf%2B4P7lLzKPkphab9pFalxiAkQyEdM1mzf6NkwE4xOMLHctJwLt9zsbpZE3EVrljDYU543IP2CdmZkY%2Fc6V18MrhUjLQOf3JxJvARjebkgkeGSRtmM%2BcvUqMuAvf7xLTdJOCPssaEwSFHUnSOhHDLWYeFkeMrxY2cRyiNxdnSStzTCW8a10%2FHt7t6hLMufKmW0fQ5o40E4LQ21uJY%2FSHl4CnBi1gS%2BK6XAC9sBT20KvNrRj%2BiJ7%2FIIXvjMWrs5ayUzowirGyFBzdsGiaDmE%2FzPfuV7aowuIfQsQY6sgEGZKLIo2jznU836Bwg3N8jv5Z%2BGaogkHJzbLTTGJXb%2F47XekGCYLL%2B3U9vry7tzpCyr1KpwHeD4mzxb7IL0%2B8uFIknYXwHsOYn7uM5%2Fo5NiTgKIxrFUAHTmcMtCxaf6CYoCbkljK%2FryI1kEDU3HR6Au%2F3tQg0Y1Eh2wYbQVNkH9nHLByXatcmz%2F%2B%2FU3zmahD%2By63A15Kw1GrRc8ykF2Y9eLvJLxHYMag0SrLIQkYNNSBv%2F&Expires=1714814879 [following]
--2024-05-02 17:27:59--  https://encode-public.s3.amazonaws.com/2022/05/15/0571c671-3645-4f92-beae-51dfd3f42c36/ENCFF621AIY.hic?response-content-disposition=attachment%3B%20filename%3DENCFF621AIY.hic&AWSAccessKeyId=ASIATGZNGCNXWCWCL4OE&Signature=3NvvPpiax3yeE7%2FE4wWH5CuDsSQ%3D&x-amz-security-token=IQoJb3JpZ2luX2VjEA4aCXVzLXdlc3QtMiJGMEQCIBpPJDCDpCVE1M1vw%2FGls2BE%2BFDxQ8WFkP9EWSCkOqZlAiBr7OX9Lb4BuCeL2Rzr2VonmZTiZLTREFMrvdXxSr9pDiqzBQhmEAAaDDIyMDc0ODcxNDg2MyIMP%2BrJcOeQq0JafnWlKpAFa5f7%2BX5lEZqVdYhDBt4%2BQmSGhQM%2BgTu8OBtUQy4fxsxQoTXpQDoEuqCjA7q1YtE5kSsb2lgURelfleGCt6j7YkGpYEqdFdh1auYeOaQKX99wGw2Yl3iLxn5WoBlUtbior0Pv4iGZtiNtMbiNmlvPQ5ahn654t42LjUZacPOo7XqXfHUjFEpd3uT3TSW68qg8reylYFMIRPr%2BRX6yMe64cImfiyFwOhbqK%2FlqQZ1qKHvC8mO3K9YAIceWtICkdbSMBs9gx1wMVyEnhSH70YvWvqsEqE6lx745%2F5VF%2FpujjxSLqegZj3m1jA94SMQi0DOSjRuOzXB%2FeUqaRECZQaxowHsn%2B02oAic16iIPyhMSDccLHfNRBDKa7WslPNfVsg25LlOQPQUfScCm4aq30AKjrczwM6tS7bLOCqc1fnLEE2GbMmrInXMeqCImMcb9NtYpbgsGFOq2922RBmJigTKexalmSQiNqsdqKIV7zN0qsAirdbvYiYyEQcwmApPcJ34mvrV2%2BaBLIqmt2lhUpOYiAtmEwCHt3J%2B8gh%2FVBOW9BOZmXGB3YKP8iml%2BUZKU2rj7F2KD9FhTu9zYh4sqf%2B4P7lLzKPkphab9pFalxiAkQyEdM1mzf6NkwE4xOMLHctJwLt9zsbpZE3EVrljDYU543IP2CdmZkY%2Fc6V18MrhUjLQOf3JxJvARjebkgkeGSRtmM%2BcvUqMuAvf7xLTdJOCPssaEwSFHUnSOhHDLWYeFkeMrxY2cRyiNxdnSStzTCW8a10%2FHt7t6hLMufKmW0fQ5o40E4LQ21uJY%2FSHl4CnBi1gS%2BK6XAC9sBT20KvNrRj%2BiJ7%2FIIXvjMWrs5ayUzowirGyFBzdsGiaDmE%2FzPfuV7aowuIfQsQY6sgEGZKLIo2jznU836Bwg3N8jv5Z%2BGaogkHJzbLTTGJXb%2F47XekGCYLL%2B3U9vry7tzpCyr1KpwHeD4mzxb7IL0%2B8uFIknYXwHsOYn7uM5%2Fo5NiTgKIxrFUAHTmcMtCxaf6CYoCbkljK%2FryI1kEDU3HR6Au%2F3tQg0Y1Eh2wYbQVNkH9nHLByXatcmz%2F%2B%2FU3zmahD%2By63A15Kw1GrRc8ykF2Y9eLvJLxHYMag0SrLIQkYNNSBv%2F&Expires=1714814879
Resolving encode-public.s3.amazonaws.com (encode-public.s3.amazonaws.com)... 52.92.208.249, 52.92.225.193, 52.92.241.41, ...
Connecting to encode-public.s3.amazonaws.com (encode-public.s3.amazonaws.com)|52.92.208.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 33783625697 (31G) [binary/octet-stream]
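
The Length line gives the size the server announced for ENCFF621AIY.hic: 33783625697 bytes, roughly 31.5 GiB. That is consistent with the 31G shown by wget and the 32G shown by ls -lh, which typically rounds up. A stricter check compares the byte counts directly. The sketch below assumes wget-style "Length:" lines as seen in this log; the helper names log_lengths and file_size are hypothetical.

```shell
# Hypothetical helpers, assuming wget-style log lines such as
# "Length: 33783625697 (31G) [binary/octet-stream]".

# Print the byte count from every "Length:" line in a log file.
log_lengths() {
    awk '$1 == "Length:" { print $2 }' "$1"
}

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Example: compare the first announced length with the .hic file on disk.
#   expected=$(log_lengths "${FD_OUT}/run_download.log.txt" | head -n 1)
#   actual=$(file_size "${FD_OUT}/K562.hg38.ENCSR479XDG.ENCFF621AIY.hic_intact.matrix.hic")
#   [ "$expected" = "$actual" ] && echo "size matches" || echo "size differs"
```

Matching sizes only show that the transfer was not truncated. Comparing against the checksum published on the ENCODE file page would be a stronger test.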