The concept of Big Data is rapidly becoming popular in a variety of domains. The main purpose of this paper is to summarize the impact of Big Data in health care, the challenges of collecting it, and the privacy concerns it raises. Big Data in health care has its own characteristics, such as heterogeneity, incompleteness, timeliness and longevity, privacy, and ownership. These characteristics bring a series of challenges for storing, collecting, and sharing data to promote health-related research. From the patient's perspective, the application of Big Data analysis could bring improved treatment and lower costs.
Introduction
Big Data is the generic term for sets of structured and unstructured data that are so large and complex that standard software, algorithms, and data repositories are inadequate to gather, process, analyze, and store them. It has become an intensively studied area in recent years. With the advancement of the Internet, social media, biology, finance, and digital medicine, the amount of data has increased dramatically. As its name suggests, Big Data refers to the massive size of the data, but it also implies the ability to process data at high speed and the novel technologies and approaches needed to handle it. During the 21st century, Big Data has gone through a series of evolutionary steps, and software suited to it has been developed. With the rapid expansion of data exchange, Big Data has grown not only in size but also in technological maturity. Currently, the infrastructure for Big Data includes servers, storage systems, cloud services, and networking equipment. The advanced analytical technologies developed for Big Data have driven its application in many areas, such as health care, crime fighting, finance, genomics, complex physics simulations, biology, and environmental research. Health care data are one of the driving forces of Big Data and have shown a prominent increase in volume. For instance, a single human genome occupies 100–150 GB. In terms of total size, Big Data in health care exceeded 150 exabytes after 2011, and one study put the data size in health care at around 40 zettabytes in 2020, about 50 times the 2009 amount of just 0.8 zettabytes.
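The figures quoted above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes decimal (SI) units, which the text does not state, and all variable names are illustrative.

```python
# Assumption: decimal (SI) storage units, expressed in bytes.
GB = 10**9
EB = 10**18
ZB = 10**21

# Health care data volumes quoted in the text.
size_2009_bytes = 8 * ZB // 10   # 0.8 ZB, kept as an exact integer
size_2020_bytes = 40 * ZB

# Ratio between the 2020 and 2009 figures.
growth_factor = size_2020_bytes / size_2009_bytes

# Number of genomes of 100-150 GB each that would fit in 150 EB.
genome_low_bytes = 100 * GB
genome_high_bytes = 150 * GB
genomes_min = 150 * EB // genome_high_bytes
genomes_max = 150 * EB // genome_low_bytes

print(f"Growth factor 2009 -> 2020: {growth_factor:.0f}x")
print(f"Genomes that fit in 150 EB: {genomes_min:,} to {genomes_max:,}")
```

The ratio comes out at 50, which matches the "about 50 times" stated in the text. Under the same assumptions, 150 exabytes corresponds to between 1.0 and 1.5 billion genomes of the quoted size.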

Challenges for Collecting Big Data in Health Care
As mentioned in the introduction, Big Data in health care contains a large amount of unstructured data, such as natural language text and handwritten notes, whose integration, analysis, and storage are difficult. Determining how to collect large amounts of unstructured data effectively remains a serious challenge. One characteristic of Big Data is the variety of its data sources, and medical data are also highly time-sensitive. The medical industry places heavy demands on data processing speed, especially when a patient's condition might deteriorate quickly. In addition, when real-time services such as cloud computing are used to access and analyze data, the privacy and security of patient data are also a challenge. Cloud computing now offers new possibilities for collecting and sharing medical Big Data. However, some challenges must be overcome before cloud computing can become more practical. In medicine, large amounts of data often have to be imported to or exported from the cloud. Network bandwidth constraints limit the speed of data transmission and also increase the cost of cloud computing. At present, attention is focused mainly on the accuracy of Big Data; timely and accurate data collection is another challenge, and work on it remains at an early stage.
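To give a feel for the bandwidth constraint, the sketch below estimates how long it would take to upload a single genome of 100–150 GB (the size quoted in the introduction) to the cloud. The link speeds of 100 Mbit/s and 1 Gbit/s are assumptions chosen for illustration, the function name is hypothetical, and protocol overhead, compression, and congestion are ignored.

```python
GB = 10**9  # assumption: decimal gigabytes, in bytes


def transfer_hours(size_bytes: int, bandwidth_mbit_s: float) -> float:
    """Idealized transfer time in hours, ignoring overhead and congestion."""
    size_bits = size_bytes * 8
    seconds = size_bits / (bandwidth_mbit_s * 10**6)
    return seconds / 3600


# Hypothetical link speeds in Mbit/s; not taken from the text.
for mbit_s in (100, 1000):
    low = transfer_hours(100 * GB, mbit_s)
    high = transfer_hours(150 * GB, mbit_s)
    print(f"{mbit_s} Mbit/s: {low:.2f} to {high:.2f} hours per genome")
```

Under these assumptions, one genome takes roughly 2.2 to 3.3 hours on a 100 Mbit/s link and about a tenth of that on a 1 Gbit/s link, which suggests why moving many such data sets in and out of the cloud is slow and costly.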

Big Data in Medical Experiments
This part of Big Data mainly covers molecular biology, clinical trials, biology samples, gene sequences, clinical and medical research laboratory tests, and omics data (explained below). Molecular biology, a significant part of both biological and medical experiments, focuses on the interaction and regulation of biological activities within cells, such as the interactions between DNA, RNA, and proteins and their biosynthesis. It is closely related to biochemistry and genetics in the study of proteins and genes. The main techniques of molecular biology include molecular cloning, polymerase chain reaction, and macromolecule blotting and probing. Biology samples include samples of cells, tissues, and organs of the human body, as well as the cross-sectional photographs of the human body from the Visible Human Project, which are used to visualize human anatomy in support of medical activities. When a new drug, vaccine, or medical device has been created, clinical trials should be conducted before it comes into use. A clinical trial, a kind of experiment or observation in medical or clinical research, is a procedure for evaluating the effectiveness of a new medical treatment by administering it to human volunteers and then analyzing and studying the results. Gene sequencing, which mainly refers to DNA sequencing, is the research activity of determining the precise order of nucleotides within DNA. This process produces a very large amount of data recording DNA sequences. Omics data are sets of biological information cataloged at the molecular level, which include genomics, proteomics, metabolomics, transcriptomics, epigenomics, lipidomics, immunomics, glycomics, and RNomics.
Data Privacy
Health care data are more sensitive and centralized than other kinds of Big Data, and there are significant concerns regarding confidentiality. However, no perfect solution to the problem of protecting patient data privacy has yet emerged. Patient data leakage may have unpredictable consequences, and there have been many real cases at home and abroad. Big Data technology exposes personal medical data to a greater risk. Some people even believe that in the era of Big Data, protecting personal privacy is impossible. The problem can be alleviated by special processing (such as de-identification and digital identity encryption), but identifying and de-identifying data still requires people or applications to handle identifiable information, so the patient's health information may be misappropriated by others without the patient's knowledge or authorization. Big Data increases the risks to patient data for two reasons. The first is the risk inherent in the data itself. Data can be copied and preserved without constraints of space and time, which makes the risk both high and long-lasting under Big Data conditions. The second is the risk posed by Big Data technology. Even when a Big Data database holds anonymized or encrypted personal data, a residual risk of re-identification remains: the data are typically pseudonymized rather than fully anonymous, so personal identities can be re-determined with data-linkage techniques. This risk is greater when different data sets are related to one another. De-anonymization is an attack in which anonymous data are compared with data from other sources in order to re-identify the individuals behind the anonymous data.
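The re-identification risk described above can be illustrated with a small, entirely hypothetical example. In the sketch below, patient names are replaced with keyed-hash pseudonyms, yet the records can still be linked to a public register through the remaining attributes (postal code, birth year, and sex). All records, field names, the key, and the function names are invented for illustration and do not describe any real system or data set.

```python
import hashlib
import hmac

# Assumption: a secret key held by the data custodian (hypothetical value).
SECRET_KEY = b"hypothetical-secret-key"


def pseudonymize(name: str) -> str:
    """Replace a name with a keyed-hash pseudonym (HMAC-SHA256, truncated)."""
    digest = hmac.new(SECRET_KEY, name.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


# Hypothetical "de-identified" health records: names are gone, but
# quasi-identifiers (zip, birth_year, sex) remain in the data.
health_records = [
    {"pid": pseudonymize("Alice Example"), "zip": "12345",
     "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"pid": pseudonymize("Bob Example"), "zip": "12345",
     "birth_year": 1975, "sex": "M", "diagnosis": "asthma"},
    {"pid": pseudonymize("Carol Example"), "zip": "67890",
     "birth_year": 1990, "sex": "F", "diagnosis": "hypertension"},
]

# Hypothetical public register that an attacker could obtain elsewhere.
public_register = [
    {"name": "Alice Example", "zip": "12345", "birth_year": 1980, "sex": "F"},
    {"name": "Bob Example", "zip": "12345", "birth_year": 1975, "sex": "M"},
    {"name": "Dan Example", "zip": "12345", "birth_year": 1975, "sex": "M"},
    {"name": "Carol Example", "zip": "67890", "birth_year": 1990, "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(records, register):
    """Return {name: diagnosis} for records matching exactly one person."""
    # Index the register by its quasi-identifier combination.
    index = {}
    for person in register:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])

    reidentified = {}
    for record in records:
        key = tuple(record[q] for q in QUASI_IDENTIFIERS)
        candidates = index.get(key, [])
        if len(candidates) == 1:  # a unique match reveals the identity
            reidentified[candidates[0]] = record["diagnosis"]
    return reidentified


reidentified = link_records(health_records, public_register)
for name, diagnosis in reidentified.items():
    print(f"{name} -> {diagnosis}")
```

In this toy data, two of the three pseudonymized records are re-identified because their combination of quasi-identifiers is unique in the register. The third is not, because two people in the register share the same combination. This is the intuition behind generalizing or suppressing such attributes before data are shared.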

Conclusion
Medical research that integrates Big Data will help raise human health to the next level, both more broadly and more deeply. This paper has summarized recent research on medical data, focusing on the impact of Big Data in health care, the challenges of collecting it, and the privacy concerns it raises. In addition, we have summarized and reflected on the opportunities and challenges in the study of medical Big Data. In general, research on medical data is not yet mature, and many problems remain to be resolved. To take full advantage of the profound patterns contained in these massive data, Big Data storage, collection, analysis, and related expertise are essential. These technologies and skills will support research on health care Big Data and further serve a wide range of medical applications, such as public health, medical care, medical insurance, and many others.
References
https://en.wikipedia.org/wiki/Big_data
https://pubmed.ncbi.nlm.nih.gov/28679881/
https://itrexgroup.com/blog/big-data-in-healthcare-examples-problems-benefits/
https://www.snia.org/education/what-is-data-privacy
https://www.techtarget.com/searchcio/definition/data-privacy-information-privacy