Prepare Hi-C data 02 (in situ Hi-C)

Download the data

Set environment

Code
source ../run_config_project.sh
show_env
You are working on             Duke Server: HARDAC
BASE DIRECTORY (FD_BASE):      /data/reddylab/Kuei
REPO DIRECTORY (FD_REPO):      /data/reddylab/Kuei/repo
WORK DIRECTORY (FD_WORK):      /data/reddylab/Kuei/work
DATA DIRECTORY (FD_DATA):      /data/reddylab/Kuei/data
CONTAINER DIR. (FD_SING):      /data/reddylab/Kuei/container

You are working with           ENCODE FCC
PATH OF PROJECT (FD_PRJ):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC
PROJECT RESULTS (FD_RES):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results
PROJECT SCRIPTS (FD_EXE):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/scripts
PROJECT DATA    (FD_DAT):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data
PROJECT NOTE    (FD_NBK):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/notebooks
PROJECT DOCS    (FD_DOC):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/docs
PROJECT LOG     (FD_LOG):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/log
PROJECT APP     (FD_APP):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/app
PROJECT REF     (FD_REF):      /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/references
PROJECT IMAGE   (FP_PRJ_SIF):  /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/app/singularity_proj_encode_fcc.sif
Code
TXT_FOLDER=hic_insitu_K562_ENCSR545YBD

Run the download script

Run run_download.sh, the script generated in the previous step.

Execute

Code
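# Output folder for this dataset
FD_OUT=${FD_DAT}/external/${TXT_FOLDER}

# Stop here if the folder is missing, so the script is not run from the wrong place
cd "${FD_OUT}" || exit 1
chmod +x ./run_download.sh

./run_download.sh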

Review

Check output files

Code
ls -1 "${FD_OUT}"
K562.hg38.ENCSR545YBD.ENCFF271SAF.hic_insitu.contact_domain.bedpe.gz
K562.hg38.ENCSR545YBD.ENCFF616PUW.hic_insitu.matrix.hic
K562.hg38.ENCSR545YBD.ENCFF693XIL.hic_insitu.loops.bedpe.gz
run_download.log.txt
run_download.sh
Code
ls -lh ${FD_OUT}
total 22G
-rw-rw-r-- 1 kk319 reddylab 293K Jan 24  2023 K562.hg38.ENCSR545YBD.ENCFF271SAF.hic_insitu.contact_domain.bedpe.gz
-rw-rw-r-- 1 kk319 reddylab  22G Jan 24  2023 K562.hg38.ENCSR545YBD.ENCFF616PUW.hic_insitu.matrix.hic
-rw-rw-r-- 1 kk319 reddylab 880K Jan 24  2023 K562.hg38.ENCSR545YBD.ENCFF693XIL.hic_insitu.loops.bedpe.gz
-rw-rw-r-- 1 kk319 reddylab  34M May  2 15:46 run_download.log.txt
-rwxrwxr-x 1 kk319 reddylab  603 May  2 15:32 run_download.sh
Code
FP_OUT=${FD_OUT}/K562.hg38.ENCSR545YBD.ENCFF693XIL.hic_insitu.loops.bedpe.gz
zcat ${FP_OUT} | head -n 5
#chr1   x1  x2  chr2    y1  y2  name    score   strand1 strand2 color   observed    expectedBL  expectedDonut   expectedH   expectedV   fdrBL   fdrDonut    fdrH    fdrV    numCollapsed    centroid1   centroid2   radius
# juicer_tools version 2.13.06
chr10   34010000    34020000    chr10   35050000    35060000    .   .   .   .   0,255,255   25.0    6.6749783   6.117001    6.028173    4.855595    0.0036679406    3.0621103E-4    3.4923825E-4    6.161639E-6 1   34015000    35055000    0
chr10   124000000   124005000   chr10   124380000   124385000   .   .   .   .   0,255,255   38.0    13.690522   14.195029   18.101181   18.336344   0.0022106771    0.0023932518    0.058508653 0.059743028 3   123997500   124387500   7071
chr10   133330000   133335000   chr10   133495000   133500000   .   .   .   .   0,255,255   39.0    14.323421   15.468441   18.856985   18.800962   0.0011134844    0.0011820481    0.037496306 0.038466662 2   133330000   133500000   3536
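The preview above shows two header lines starting with `#` before the loop records. As a minimal sketch for checking how many loops the file holds, the helper below counts only the non-header lines. The function name `count_bedpe_records` is hypothetical and not part of the project scripts.

```shell
# Hypothetical helper (not from the project): count data records in a
# gzipped BEDPE file, skipping header/comment lines that start with '#'.
count_bedpe_records() {
    gzip -dc "$1" | awk '!/^#/ { n++ } END { print n + 0 }'
}

# Example call, assuming FP_OUT points at the loops file as in the cell above:
#   count_bedpe_records "${FP_OUT}"
```

Using `awk` rather than `grep -vc` keeps the exit status at zero even when a file contains no records, which is convenient inside scripts run with `set -e`.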
Code
FP_OUT=${FD_OUT}/K562.hg38.ENCSR545YBD.ENCFF271SAF.hic_insitu.contact_domain.bedpe.gz
zcat ${FP_OUT} | head -n 5
#chr1   x1  x2  chr2    y1  y2  name    score   strand1 strand2 color   score   uVarScore   lVarScore   upSign  loSign
# juicer_tools version 2.13.06
chr10   84565000    86085000    chr10   84565000    86085000    .   .   .   .   255,255,0   0.797322669093712   0.36036680286901346 0.34029928924052977 0.44465944272445823 0.547858617131063
chr10   50985000    52225000    chr10   50985000    52225000    .   .   .   .   255,255,0   1.0083018408826874  0.3171058429109649  0.34772601741376474 0.5433548387096774  0.4545161290322581
chr10   98415000    99300000    chr10   98415000    99300000    .   .   .   .   255,255,0   0.6516994003966435  0.2746901933484111  0.28586117952993195 0.4402221941674031  0.6153263476833734
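In the contact domain file, each record repeats the same interval on both sides (x1/x2 equal y1/y2 in the rows shown), so the domain length can be read as x2 minus x1. The sketch below summarizes those lengths as a quick sanity check. The function name `summarize_domain_sizes` is hypothetical, and the column positions are assumed from the header line printed above.

```shell
# Hypothetical helper (not from the project): report count, min, mean, and
# max of domain lengths (column 3 minus column 2) in a gzipped BEDPE file.
summarize_domain_sizes() {
    gzip -dc "$1" | awk '
        /^#/ { next }                       # skip header/comment lines
        {
            len = $3 - $2                   # x2 - x1
            if (n == 0 || len < min) min = len
            if (n == 0 || len > max) max = len
            sum += len
            n++
        }
        END {
            if (n > 0) printf "n=%d min=%d mean=%.0f max=%d\n", n, min, sum / n, max
            else print "n=0"
        }'
}

# Example call, assuming FP_OUT points at the contact domain file:
#   summarize_domain_sizes "${FP_OUT}"
```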

Check execution log

Code
head -n 10 ${FD_OUT}/run_download.log.txt
--2024-05-02 15:34:35--  https://www.encodeproject.org/files/ENCFF616PUW/@@download/ENCFF616PUW.hic
Resolving www.encodeproject.org (www.encodeproject.org)... 34.211.244.144
Connecting to www.encodeproject.org (www.encodeproject.org)|34.211.244.144|:443... connected.
HTTP request sent, awaiting response... 307 Temporary Redirect
Location: https://encode-public.s3.amazonaws.com/2022/10/10/0332f11a-aa2a-4532-9615-5b42b55ea5c3/ENCFF616PUW.hic?response-content-disposition=attachment%3B%20filename%3DENCFF616PUW.hic&AWSAccessKeyId=ASIATGZNGCNXR3L4ZKMY&Signature=faTvELv2J0DHQgbiE3t72r9jI2E%3D&x-amz-security-token=IQoJb3JpZ2luX2VjEAsaCXVzLXdlc3QtMiJIMEYCIQDFn4xV49f4H2d%2BsUDx7CV4DLpzASOUFpb0ToTdr4dmcgIhAPgUjjG%2F81sRE%2BygD62Cpjd2AJGDQpqdu59K%2F1WwsEL2KrMFCGQQABoMMjIwNzQ4NzE0ODYzIgzuRYgAD%2BCPG4U7K3gqkAW3NuXXZB%2FcE5VngsuguJIdNdM%2Bm8bC4q5vmsDqbvBrxYd1sf9clmSoXjIPaY3cw1%2FqKSkHThMGEoGdOiA35kXR2N1pZ1ZmIdFULjMbH4vKTwRgDrIbyU%2BYX76bw5hUoPsKX9s6vwvlQXskTJZp7ktX2Fetd5Xemr1zO0vQV5loNL8nY9phweQJxuinCqAjEjS6XGDDDxZxiCWmsPhVCJeXLlQI%2FbCJtwtDimxeXet42qWnq5KMarBZ8A%2BX1QNws3A92E6fWS4keUdPWyrWbFPh21IoS8A%2BDvSCOkEARX4JxVqV6w91ZS2MFrj%2F3sMW7xAyFF%2BT%2B6x41VO%2Bz3JV2yrxTOS76qQ05XQrOcYnpfEmPavmSDc7hoo0DpLrXwSToybd0gqP24EklEneb3%2Bh0qDnlI96G8tWdCk%2B9dqpPgQlXoB0tH46S%2BtYVsYr2xQe0kPKpVDptu15RNwhddZ%2BM7jRFjMfB058cC0oZ8dbyckJc4gzFCFILQIt6vElrkkUJnu3I6JYXZg1zHG22khRZEHDqQNg2uA%2BfkScL2YnEECPb3gFNM4tdWnl6OYDJm85QFY9CRpdCLl%2F%2BXK3B7%2B4jKv1mk7Za3%2BJvMH84lXZhEc1ICQbN4JKUgjWMgOpWuQNO1uz%2BaUJHPnPbqDfXiFn4uQVPHV9N2BwpF2u58xbv30APeax0X0WBGMriBDlsxLmBMDVtqhA2NBynCmPHNG%2BBSfWuF3IPww6wLrh%2B3ce8zYjkfpn0g1o4RvDaCW2UXo2%2FhP4u5u8CQNVXYJ0BaWvovVSX4lvGpVxJT%2BUif5J%2Fue64Y%2FetkdINsOVk1XJtz0oIf038VJ4J58V6Tvw1gHbVtoic%2B4blZebW%2Ba5nhORLk26kjCWy8%2BxBjqwAV5rlnkaaUttgzPUZmBXtPCavUeUumz24ToeqAN68ioNUe04CFRA8UHOVPfvY3LgMmbaCklmvMntZuPay4rahvcyE4yYth19rvuKpKqPYySdAOaHYYfOzVCzrIH8Te84C6%2FeM7n8mBhkkzLLOVW2fo0aq4yqPG8nVAoaO%2F%2FdQ0%2Fs0WDQB%2FdUNZtJSEaT%2BHR2NRMHTa%2FJBnZqEpqMRH%2BHduJkVlI6skkS%2Bu%2BxCFiBczWg&Expires=1714808075 [following]
--2024-05-02 15:34:35--  https://encode-public.s3.amazonaws.com/2022/10/10/0332f11a-aa2a-4532-9615-5b42b55ea5c3/ENCFF616PUW.hic?response-content-disposition=attachment%3B%20filename%3DENCFF616PUW.hic&AWSAccessKeyId=ASIATGZNGCNXR3L4ZKMY&Signature=faTvELv2J0DHQgbiE3t72r9jI2E%3D&x-amz-security-token=IQoJb3JpZ2luX2VjEAsaCXVzLXdlc3QtMiJIMEYCIQDFn4xV49f4H2d%2BsUDx7CV4DLpzASOUFpb0ToTdr4dmcgIhAPgUjjG%2F81sRE%2BygD62Cpjd2AJGDQpqdu59K%2F1WwsEL2KrMFCGQQABoMMjIwNzQ4NzE0ODYzIgzuRYgAD%2BCPG4U7K3gqkAW3NuXXZB%2FcE5VngsuguJIdNdM%2Bm8bC4q5vmsDqbvBrxYd1sf9clmSoXjIPaY3cw1%2FqKSkHThMGEoGdOiA35kXR2N1pZ1ZmIdFULjMbH4vKTwRgDrIbyU%2BYX76bw5hUoPsKX9s6vwvlQXskTJZp7ktX2Fetd5Xemr1zO0vQV5loNL8nY9phweQJxuinCqAjEjS6XGDDDxZxiCWmsPhVCJeXLlQI%2FbCJtwtDimxeXet42qWnq5KMarBZ8A%2BX1QNws3A92E6fWS4keUdPWyrWbFPh21IoS8A%2BDvSCOkEARX4JxVqV6w91ZS2MFrj%2F3sMW7xAyFF%2BT%2B6x41VO%2Bz3JV2yrxTOS76qQ05XQrOcYnpfEmPavmSDc7hoo0DpLrXwSToybd0gqP24EklEneb3%2Bh0qDnlI96G8tWdCk%2B9dqpPgQlXoB0tH46S%2BtYVsYr2xQe0kPKpVDptu15RNwhddZ%2BM7jRFjMfB058cC0oZ8dbyckJc4gzFCFILQIt6vElrkkUJnu3I6JYXZg1zHG22khRZEHDqQNg2uA%2BfkScL2YnEECPb3gFNM4tdWnl6OYDJm85QFY9CRpdCLl%2F%2BXK3B7%2B4jKv1mk7Za3%2BJvMH84lXZhEc1ICQbN4JKUgjWMgOpWuQNO1uz%2BaUJHPnPbqDfXiFn4uQVPHV9N2BwpF2u58xbv30APeax0X0WBGMriBDlsxLmBMDVtqhA2NBynCmPHNG%2BBSfWuF3IPww6wLrh%2B3ce8zYjkfpn0g1o4RvDaCW2UXo2%2FhP4u5u8CQNVXYJ0BaWvovVSX4lvGpVxJT%2BUif5J%2Fue64Y%2FetkdINsOVk1XJtz0oIf038VJ4J58V6Tvw1gHbVtoic%2B4blZebW%2Ba5nhORLk26kjCWy8%2BxBjqwAV5rlnkaaUttgzPUZmBXtPCavUeUumz24ToeqAN68ioNUe04CFRA8UHOVPfvY3LgMmbaCklmvMntZuPay4rahvcyE4yYth19rvuKpKqPYySdAOaHYYfOzVCzrIH8Te84C6%2FeM7n8mBhkkzLLOVW2fo0aq4yqPG8nVAoaO%2F%2FdQ0%2Fs0WDQB%2FdUNZtJSEaT%2BHR2NRMHTa%2FJBnZqEpqMRH%2BHduJkVlI6skkS%2Bu%2BxCFiBczWg&Expires=1714808075
Resolving encode-public.s3.amazonaws.com (encode-public.s3.amazonaws.com)... 52.92.249.241, 52.92.241.145, 52.92.163.137, ...
Connecting to encode-public.s3.amazonaws.com (encode-public.s3.amazonaws.com)|52.92.249.241|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22638682080 (21G) [binary/octet-stream]
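The `Length:` line in the log gives the byte count the server announced, which can be compared against the size of the file on disk to catch truncated downloads. The sketch below assumes the log is plain wget output like the excerpt above, where each request line starts with `--` and contains the file accession. The function names are hypothetical and not part of the project scripts.

```shell
# Hypothetical helpers (not from the project).

# Print the byte count from the first "Length:" line that follows a request
# line (starting with "--") naming the given accession.
expected_bytes() {
    awk -v acc="$2" '
        /^--/ { hit = (index($0, acc) > 0) }   # request line: does it name the accession?
        hit && /^Length:/ { print $2; exit }
    ' "$1"
}

# Print the size of a file in bytes.
actual_bytes() {
    wc -c < "$1" | tr -d '[:space:]'
}

# Compare the announced and actual sizes; returns non-zero on mismatch.
# Arguments: <log file> <accession> <downloaded file>
check_download_size() {
    local want got
    want=$(expected_bytes "$1" "$2")
    got=$(actual_bytes "$3")
    if [ -n "$want" ] && [ "$want" = "$got" ]; then
        echo "OK $3 ($got bytes)"
    else
        echo "MISMATCH $3: expected ${want:-unknown}, found $got"
        return 1
    fi
}

# Example call, assuming FD_OUT is set as in the cells above:
#   check_download_size "${FD_OUT}/run_download.log.txt" ENCFF616PUW \
#       "${FD_OUT}/K562.hg38.ENCSR545YBD.ENCFF616PUW.hic_insitu.matrix.hic"
```

A size match does not prove the content is intact. If an MD5 checksum for the file is available from its source, comparing that checksum is the stronger test.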