{"id":16001,"date":"2022-05-26T09:25:53","date_gmt":"2022-05-26T09:25:53","guid":{"rendered":"https:\/\/blog.datumo.com\/en\/?p=16001"},"modified":"2024-10-22T08:17:10","modified_gmt":"2024-10-22T08:17:10","slug":"korquad-dataset-2-0","status":"publish","type":"post","link":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001","title":{"rendered":"KorQuad Dataset 2.0"},"content":{"rendered":"[vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div class=\"pix-content-box card      vc_custom_1654579723245 custom-responsive-164930898   rounded-lg bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text]\r\n<p style=\"text-align: left;\"><span style=\"font-size: 14pt;\"><strong>\ud83d\udd11<\/strong> <strong>In 10 minutes you will learn:<\/strong><\/span><\/p>\r\n\r\n<ul class=\"p-rich_text_list p-rich_text_list__bullet\" data-stringify-type=\"unordered-list\" data-indent=\"0\" data-border=\"0\">\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">How Datumo collected and labeled text datasets<\/li>\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">Sample data from each category<\/li>\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">Process of data annotation and validation<\/li>\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">Where to download the full dataset<\/li>\r\n<\/ul>\r\n[\/vc_column_text]<\/div><\/div><div id=\"el1646799961152-e3ee06c0-4e82\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row][vc_section full_width=&#8221;stretch_row&#8221; pix_over_visibility=&#8221;&#8221; css=&#8221;.vc_custom_1650444445523{padding-top: 80px !important;padding-bottom: 80px !important;background-color: #f8f9fa !important;}&#8221; el_id=&#8221;pix_section_program&#8221;][vc_row pix_particles_check=&#8221;&#8221;][vc_column width=&#8221;1\/4&#8243;]<div class=\"d-inline-block d-sm-flex w-100 text-center align-items-center   \"><div class=\"pix-md-circles d-inline-block2 
d-inline-flex align-items-center align-self-center align-middle\"><span class=\"align-middle circle-item pix-mr-5 pix-bg-custom  shadow-sm \" data-anim-type=\"\" data-anim-delay=\"0\" data-toggle=\"tooltip\" data-placement=\"bottom\" title=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/cropped-1.1-1.png\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/cropped-1.1-1-239x300.png 239w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/cropped-1.1-1.png 280w\" sizes=\"auto, (max-width: 280px) 100vw, 280px\" width=\"60\" height=\"60\" class=\"rounded-circle bg-white\" loading=\"lazy\" alt=\"\" \/><\/span><span class=\"align-middle circle-item pix-mr-5 pix-bg-custom  shadow-sm \" data-anim-type=\"\" data-anim-delay=\"100\" data-toggle=\"tooltip\" data-placement=\"bottom\" title=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/04\/LGCNSLOGO.png\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/04\/LGCNSLOGO-300x300.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/04\/LGCNSLOGO-150x150.png 150w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/04\/LGCNSLOGO-400x400.png 400w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/04\/LGCNSLOGO-75x75.png 75w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/04\/LGCNSLOGO.png 421w\" sizes=\"auto, (max-width: 421px) 100vw, 421px\" width=\"60\" height=\"60\" class=\"rounded-circle bg-white\" loading=\"lazy\" alt=\"\" \/><\/span><\/div><\/div>[\/vc_column][vc_column width=&#8221;3\/4&#8243;]<div id=\"el1653580703805-67d0b5eb-a7dd\" class=\"mb-3 text-left \"><h2 class=\"mb-32 pix-sliding-headline font-weight-bold secondary-font\" data-class=\"secondary-font text-heading-default\" data-style=\"\">About the dataset<\/h2><\/div>[vc_column_text css=&#8221;.vc_custom_1653580733187{padding-bottom: 0px 
!important;}&#8221;]This Korean question and answer dataset for web documents MRC was created by LG CNS and Datumo. We created 80,000+ question and answer pairs based on Wikipedia which results in total of 100,000+ pairs of questions and answers, including those from KorQuAD 1.0.[\/vc_column_text][\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column][vc_raw_html]JTNDbWV0YSUyMGh0dHAtZXF1aXYlM0QlMjJyZWZyZXNoJTIyJTIwY29udGVudCUzRCUyMjAlM0IlMjB1cmwlM0RodHRwcyUzQSUyRiUyRmRhdHVtby5jb20lMkZlbiUyRmtvcnF1YWQtZGF0YXNldC0yLTAlMkYlMjIlM0U=[\/vc_raw_html]<div id=\"el1650294698986-a1b962b5-ef42\" class=\"w-100 d-block \"><\/div><div class=\"pix-content-box card      vc_custom_1654579738037 custom-responsive-38299804   rounded-lg shadow-sm bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text css=&#8221;.vc_custom_1654579748953{padding-top: 40px !important;padding-right: 16px !important;padding-bottom: 0px !important;padding-left: 16px !important;}&#8221;]<img fetchpriority=\"high\" decoding=\"async\" class=\"aligncenter size-full wp-image-16003\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/img.webp\" alt=\"\" width=\"893\" height=\"262\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/img.webp 893w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/img-300x88.webp 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/img-768x225.webp 768w\" sizes=\"(max-width: 893px) 100vw, 893px\" \/>\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\nMachine Reading Comprehension(MRC) is a natural language processing project which requires the AI model to comprehend the given text and the question, and point out the answer within the text. This is the core technology of automated question answering technology. 
The standard Korean dataset for MRC is KorQuAD 1.0, which can be used not only to train AI models but also as an objective benchmark for comparing the performance of different models.\r\n\r\nThe existing Korean dataset supported question answering over short passages, such as those from Wikipedia or the news. However, most texts we face in the real world, such as web pages, product manuals, contracts, charts, and lists, are made up of many different types of sentences. These sentences vary in structure, length, and complexity, and in most cases MRC has to work across the whole document rather than a single paragraph. As a result, there was a gap between the real-world tasks that needed MRC and academic research, so algorithms that performed well academically often could not be applied in practice.\r\n\r\nTo solve this problem, LG CNS created KorQuAD 2.0 to enable machine reading comprehension for documents with various types of sentences. Datumo collected training data of 80,000 question and answer pairs formed from approximately 50,000 Wikipedia documents. 
Combined with 20,000 question and answer pairs from KorQuAD 1.0, the dataset built by Datumo and LG CNS totals more than 100,000 pairs.\r\n\r\nCredit: <a href=\"https:\/\/korquad.github.io\/dataset\/KorQuAD_2.0\/KorQuAD_2.0_paper.pdf\">https:\/\/korquad.github.io\/dataset\/KorQuAD_2.0\/KorQuAD_2.0_paper.pdf<\/a>[\/vc_column_text]<\/div><\/div><div id=\"el1653580524191-2d646ba3-81fd\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row][\/vc_section][vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div id=\"el1650442607008-a85a832d-43f0\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Dataset Specification<\/h2><\/div><\/div><\/div><div id=\"el1650442651668-7359ff25-270a\" class=\"w-100 d-block \"><\/div><div class=\"pix-content-box card      vc_custom_1654579761567 custom-responsive-47628649   rounded-lg shadow-lg shadow-hover-lg bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text]\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">Number of documents and questions<\/span><\/strong><\/h5>\r\n<table class=\" alignleft\" style=\"width: 100%; border-collapse: collapse;\">\r\n<tbody>\r\n<tr>\r\n<td style=\"width: 20.9242%; background-color: #fafafa;\"><\/td>\r\n<td style=\"width: 17.5046%; background-color: #fafafa;\">Training<\/td>\r\n<td style=\"width: 16.9501%; background-color: #fafafa;\">Validation<\/td>\r\n<td style=\"width: 21.2939%; background-color: #fafafa;\">Assessment<\/td>\r\n<td style=\"width: 54.88%; background-color: #fafafa;\"><strong>Total<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 20.9242%;\">Documents<\/td>\r\n<td style=\"width: 17.5046%;\">38,496<\/td>\r\n<td style=\"width: 16.9501%;\">4,736<\/td>\r\n<td style=\"width: 21.2939%;\">4,725<\/td>\r\n<td style=\"width: 
54.88%;\"><strong>47,957<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 20.9242%;\">Questions<\/td>\r\n<td style=\"width: 17.5046%;\">83,486<\/td>\r\n<td style=\"width: 16.9501%;\">10,165<\/td>\r\n<td style=\"width: 21.2939%;\">9,309<\/td>\r\n<td style=\"width: 54.88%;\"><strong>102,960<\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n&nbsp;\r\n<h5><strong>Ratio of answer types<\/strong><\/h5>\r\nCategorized according to the length of the answer\r\n\r\n&nbsp;\r\n<table style=\"border-collapse: collapse; width: 100%;\">\r\n<tbody>\r\n<tr>\r\n<td style=\"width: 33.3333%; background-color: #f0f0f0;\"><\/td>\r\n<td style=\"width: 33.3333%; background-color: #f0f0f0;\">Short<\/td>\r\n<td style=\"width: 33.3333%; background-color: #f0f0f0;\">Long<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 33.3333%;\">Text<\/td>\r\n<td style=\"width: 33.3333%;\">Choose answer from paragraph<\/td>\r\n<td style=\"width: 33.3333%;\">Choose whole paragraph as an answer<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 33.3333%;\">Table<\/td>\r\n<td style=\"width: 33.3333%;\">Choose answer from table<\/td>\r\n<td style=\"width: 33.3333%;\">Choose whole table as an answer<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 33.3333%;\">List<\/td>\r\n<td style=\"width: 33.3333%;\">Choose answer from list<\/td>\r\n<td style=\"width: 33.3333%;\">Choose whole list as an answer<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n[\/vc_column_text]<\/div><\/div><div id=\"el1650294913061-211813f5-5f2d\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row][vc_section full_width=&#8221;stretch_row&#8221; pix_over_visibility=&#8221;&#8221; css=&#8221;.vc_custom_1650444445523{padding-top: 80px !important;padding-bottom: 80px !important;background-color: #f8f9fa !important;}&#8221;][vc_row full_width=&#8221;stretch_row&#8221; pix_particles_check=&#8221;&#8221;][vc_column content_align=&#8221;text-center&#8221; offset=&#8221;vc_col-lg-offset-0 
vc_col-lg-12 vc_col-md-offset-1 vc_col-md-10&#8243;][vc_column_text css=&#8221;.vc_custom_1653558652929{padding-top: 40px !important;padding-right: 20px !important;padding-bottom: px !important;padding-left: 20px !important;}&#8221;]\r\n<h4 style=\"text-align: left;\"><strong>Long answer examples<\/strong><\/h4>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Subtitle repetition (38%)<\/strong><\/p>\r\n<p style=\"text-align: left;\">Q. How was the relationship between Oscar Peterson and Norman Granz formed?<\/p>\r\n<p style=\"text-align: left;\">Title. Oscar Peterson &#8211; #biography &#8211; #Norman Granz<\/p>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Subtitle variation (47%)<\/strong><\/p>\r\n<p style=\"text-align: left;\">Q. Does Lee have any brothers or sisters?<\/p>\r\n<p style=\"text-align: left;\">Title. Lee &#8211; #siblings<\/p>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Creation (15%)<\/strong><\/p>\r\n<p style=\"text-align: left;\">Q. What is the name of the law practiced in order to protect cultural assets?<\/p>\r\n<p style=\"text-align: left;\">Title. Volcanic Caves of the Upper Geomunoreum Lava Tube System &#8211; #limitedaccess<\/p>\r\n&nbsp;\r\n<p style=\"text-align: left;\">*Long answers refer to the whole paragraph within the according title.<\/p>\r\n[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653558780316{margin-top: 80px !important;border-top-width: 1px !important;padding-top: 80px !important;padding-right: 20px !important;padding-bottom: px !important;padding-left: 20px !important;border-top-color: rgba(0,0,0,0.2) !important;border-top-style: solid !important;}&#8221;]\r\n<h4 style=\"text-align: left;\"><strong>Short answer examples<\/strong><\/h4>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Short Answers<\/strong><\/p>\r\n<p style=\"text-align: left;\">Phrase variation (48.0%)<\/p>\r\n<p style=\"text-align: left;\">Q. 
In what year did Korea allow bottled water to be sold to foreign tourists?<\/p>\r\n<p style=\"text-align: left;\">\u201c\u2026 around the 1988 Seoul Olympics, Korea allowed bottled water to be sold to foreign tourists, but this was prohibited soon after \u2026\u201d<\/p>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Word variation (15.4%)<\/strong><\/p>\r\n<p style=\"text-align: left;\">Q. What is the name of the Chiba Lotte field manager who was fired during the 2009 season?<\/p>\r\n<p style=\"text-align: left;\">\u201c\u2026As the news regarding the dismissal of field manager Bobby Valentine was published, some of the fans\u2026\u201d<\/p>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Sentence combinations (8.0%)<\/strong><\/p>\r\n<p style=\"text-align: left;\">Q. What is the name of the Korean mobile device manufacturer that used \u201cDon\u2019t Cha\u201d in its commercials?<\/p>\r\n<p style=\"text-align: left;\">\u201c\u2026 their debut single <strong>\u201cDon\u2019t Cha\u201d<\/strong> ranked number one in the UK, Australia, Canada, etc\u2026 This song was also used in commercials for <strong>mobile devices of SKY, a Korean mobile device manufacturer<\/strong>\u2026\u201d<\/p>\r\n&nbsp;\r\n<p style=\"text-align: left;\"><strong>Chart\/List<\/strong><\/p>\r\n<p style=\"text-align: left;\">Q. Which party does the second runner-up belong to?<\/p>\r\n\r\n<ul>\r\n \t<li style=\"text-align: left;\">James Earl Carter Jr. (Walter Frederick Mondale) \/ Democratic \/ GA, MN<\/li>\r\n \t<li style=\"text-align: left;\">Gerald Rudolph Ford Jr. 
(Robert Joseph Dole) \/ Republican\/ MI, KS<\/li>\r\n \t<li style=\"text-align: left;\">Ronald Wilson Reagan (Robert Joseph Dole) \/ Republican\/ CA, KS<\/li>\r\n<\/ul>\r\n[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653581327260{margin-top: 80px !important;border-top-width: 1px !important;padding-top: 80px !important;padding-right: 20px !important;padding-bottom: px !important;padding-left: 20px !important;border-top-color: rgba(0,0,0,0.2) !important;border-top-style: solid !important;}&#8221;]\r\n<h4 style=\"text-align: left;\"><strong>Number of data<\/strong><\/h4>\r\n&nbsp;\r\n<p id=\"tw-target-text\" class=\"tw-data-text tw-text-large tw-ta\" dir=\"ltr\" style=\"text-align: left;\" data-placeholder=\"Translation\"><span class=\"Y2IQFc\" lang=\"en\">100,000 total<\/span><\/p>\r\n[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653581338167{margin-top: 80px !important;border-top-width: 1px !important;padding-top: 80px !important;padding-right: 20px !important;padding-bottom: px !important;padding-left: 20px !important;border-top-color: rgba(0,0,0,0.2) !important;border-top-style: solid !important;}&#8221;]\r\n<h4 style=\"text-align: left;\"><strong>Number of participants<\/strong><\/h4>\r\n&nbsp;\r\n<p style=\"text-align: left;\">1,372 people<\/p>\r\n[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653581345564{margin-top: 80px !important;border-top-width: 1px !important;padding-top: 80px !important;padding-right: 20px !important;padding-bottom: px !important;padding-left: 20px !important;border-top-color: rgba(0,0,0,0.2) !important;border-top-style: solid !important;}&#8221;]\r\n<h4 style=\"text-align: left;\"><strong>Project period<\/strong><\/h4>\r\n&nbsp;\r\n<p style=\"text-align: left;\">July to August 2019<\/p>\r\n[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653581471192{margin-top: 80px !important;border-top-width: 1px !important;padding-top: 80px !important;padding-right: 20px !important;padding-left: 20px 
!important;border-top-color: rgba(0,0,0,0.2) !important;border-top-style: solid !important;}&#8221;]\r\n<h4 style=\"text-align: left;\"><strong>LG<\/strong> <strong>AI &amp; Big Data Research Center<\/strong><\/h4>\r\n&nbsp;\r\n<p style=\"text-align: left;\">Korean question-answer dataset<\/p>\r\n[\/vc_column_text][\/vc_column][\/vc_row][\/vc_section][vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div id=\"el1653584935457-1d6124ee-85fd\" class=\"w-100 d-block \"><\/div><div id=\"el1650362147064-486b7dc2-a9b3\" class=\"w-100 d-block \"><\/div><div class=\"pix-content-box card      vc_custom_1654579797451 custom-responsive-30201966   rounded-lg bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text css=&#8221;.vc_custom_1653559747914{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">Impressive quality of various data<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\n\u201cWith Datumo, we were able to efficiently collect KorQuad 2.0, a Question-Answer dataset in Korean. We loved the quality and diversity of data, collected from broad workers. 
Especially, Datumo&#8217;s user guideline for our task was very impressive, capturing our expectations into clear explanations for the workers.\u201d\r\n\r\n<strong>AI &amp; Big Data Research Center<\/strong>[\/vc_column_text]<\/div><\/div><div class=\"pix-content-box card      vc_custom_1653584928690    rounded-lg shadow shadow-hover-lg bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\"><div class=\"row pix-levels mb-3 mb-sm-0 mx-0 \" style=\"border-bottom:0;\"><div class=\"col-xs-12 col-md-12 pix-levels-step complete px-0\">\n\t\t\t  <h5 class=\"text-center font-weight-bold text-heading-default pb-3\" style=\"\">Collect documents<\/h5>\n\t\t\t  <div class=\"position-relative w-100 text-center mb-3\">\n\t\t\t\t  <div class=\"progress bg-gray-2\" ><div class=\"progress-bar bg-primary\" ><\/div><\/div>\n\t\t\t\t  <div class=\"pix-leveles-dot-div\"><span class=\"pix-levels-dot bg-primary\" >\n\t\t\t\t\t\t\t\t\t\t  <span class=\"pix-levels-dot-inner bg-dark-opacity-3\"><\/span>\n\t\t\t\t\t\t\t\t\t  <\/span><\/div>\n\t\t\t  <\/div><p class=\"text-center pix-p-10 text-body-default\" >Documents from Wikipedia<\/p><\/div><div class=\"w-100\"><\/div><div class=\"col-xs-12 col-md-12 pix-levels-step complete px-0\">\n\t\t\t  <h5 class=\"text-center font-weight-bold text-heading-default pb-3\" style=\"\">Create question-answer pairs<\/h5>\n\t\t\t  <div class=\"position-relative w-100 text-center mb-3\">\n\t\t\t\t  <div class=\"progress bg-gray-2\" ><div class=\"progress-bar bg-primary\" ><\/div><\/div>\n\t\t\t\t  <div class=\"pix-leveles-dot-div\"><span class=\"pix-levels-dot bg-primary\" >\n\t\t\t\t\t\t\t\t\t\t  <span class=\"pix-levels-dot-inner bg-dark-opacity-3\"><\/span>\n\t\t\t\t\t\t\t\t\t  <\/span><\/div>\n\t\t\t  <\/div><p class=\"text-center pix-p-10 text-body-default\" >Large pool of crowd-workers from Cash Mission*<\/p><\/div><div class=\"w-100\"><\/div><div class=\"col-xs-12 col-md-12 pix-levels-step complete px-0\">\n\t\t\t  <h5 
class=\"text-center font-weight-bold text-heading-default pb-3\" style=\"\">Validation<\/h5>\n\t\t\t  <div class=\"position-relative w-100 text-center mb-3\">\n\t\t\t\t  <div class=\"progress bg-gray-2\" ><div class=\"progress-bar bg-primary\" ><\/div><\/div>\n\t\t\t\t  <div class=\"pix-leveles-dot-div\"><span class=\"pix-levels-dot bg-primary\" >\n\t\t\t\t\t\t\t\t\t\t  <span class=\"pix-levels-dot-inner bg-dark-opacity-3\"><\/span>\n\t\t\t\t\t\t\t\t\t  <\/span><\/div>\n\t\t\t  <\/div><p class=\"text-center pix-p-10 text-body-default\" >Cross-inspection for every single data<\/p><\/div><div class=\"w-100\"><\/div><\/div><div id=\"el1653581699462-bd538435-8c21\" class=\"w-100 d-block \"><\/div>[vc_column_text]\r\n<p style=\"text-align: center;\">*: Datumo\u2019s own developed crowd-sourcing platform which consists of 190K+ crowd-workers.<\/p>\r\n[\/vc_column_text]<div id=\"el1653581728719-edf83f2e-b5e1\" class=\"w-100 d-block \"><\/div><\/div><\/div><div id=\"el1650450433074-0be5e40e-928e\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Process of annotation<\/h2><\/div><\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1653584176535{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]<img decoding=\"async\" class=\"aligncenter size-full wp-image-16059\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-744.png\" alt=\"\" width=\"1404\" height=\"1016\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-744.png 1404w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-744-300x217.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-744-1024x741.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-744-768x556.png 768w\" 
sizes=\"(max-width: 1404px) 100vw, 1404px\" \/>\r\n\r\n<strong>KorQuAD 2.0<\/strong> has been created based on the data collected from Cash Mission. During this process, every crowd-worker was tested to validate their capabilities of forming appropriate MRC questions, before participation. Through our crowd-sourcing platform, we were able to have a variety of participants and ensure consistent data quality by utilizing our unique tutorial system**.\r\n\r\n**: Datumo\u2019s tutorial system consists of strict guidelines and tests in order to accurately assess the worker\u2019s understanding of the project.[\/vc_column_text]<div id=\"el1653581938160-3bc5a2a2-3f3a\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Collection of documents<\/h2><\/div><\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1653581864323{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]<img decoding=\"async\" class=\"aligncenter size-full wp-image-16056\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u110b\u1171\u110f\u1175\u1111\u1175\u1103\u1175\u110b\u1161.jpeg\" alt=\"\" width=\"600\" height=\"350\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u110b\u1171\u110f\u1175\u1111\u1175\u1103\u1175\u110b\u1161.jpeg 600w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u110b\u1171\u110f\u1175\u1111\u1175\u1103\u1175\u110b\u1161-300x175.jpeg 300w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/>\r\n<ul>\r\n \t<li>Used Wikipedia to collect structured documents on various topics<\/li>\r\n \t<li>Selected the top 150,000 Wikipedia documents sorted by [Page view] from June 2016 to May 2019, to collect documents that people find interesting<\/li>\r\n \t<li style=\"text-align: left;\">Added additional 50,000 documents 
to cover a broader range of text domains<\/li>\r\n<\/ul>\r\n[\/vc_column_text]<div id=\"el1650362652282-42ee7789-aa09\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Formation of question-answer pairs<\/h2><\/div><\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1653585143057{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-16066\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-1083-1.png\" alt=\"\" width=\"600\" height=\"718\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-1083-1.png 849w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-1083-1-251x300.png 251w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-1083-1-768x919.png 768w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/>\r\n<ul>\r\n \t<li style=\"text-align: left;\">Used Cash Mission, Datumo\u2019s in-house crowd-sourcing platform<\/li>\r\n \t<li style=\"text-align: left;\">Provided workers with parts of documents instead of the whole article, to prevent workers from forming Q&amp;A pairs only from the beginning or from \u201ceasier\u201d sections (to avoid data bias)<\/li>\r\n \t<li style=\"text-align: left;\">Created Q&amp;A pairs in two categories based on the length of the answer: long and short<\/li>\r\n<\/ul>\r\n[\/vc_column_text]<div id=\"el1653559273094-79d020fe-47b3\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Validation<\/h2><\/div><\/div><\/div>[vc_column_text 
css=&#8221;.vc_custom_1653664440860{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16072\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644.png\" alt=\"\" width=\"1765\" height=\"792\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644.png 1765w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-300x135.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-1024x459.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-768x345.png 768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-1536x689.png 1536w\" sizes=\"(max-width: 1765px) 100vw, 1765px\" \/>\r\n\r\n<strong>Data quality assurance<\/strong>\r\n<ul>\r\n \t<li>Cross-validation carried out by 2-3 validators, per data<\/li>\r\n \t<li>Disqualified validators with project success rate lower than 85%<\/li>\r\n<\/ul>\r\n[\/vc_column_text]<div id=\"el1653559523129-d36e426b-e9d3\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Specific guidelines to enhance data quality<\/h2><\/div><\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1653584826017{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]\r\n\r\n<figure id=\"attachment_16063\" aria-describedby=\"caption-attachment-16063\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-16063\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/KorQuAD\u1100\u1161\u110b\u1175\u1103\u11732-1024x666-1.png\" alt=\"\" width=\"1024\" height=\"666\" 
srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/KorQuAD\u1100\u1161\u110b\u1175\u1103\u11732-1024x666-1.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/KorQuAD\u1100\u1161\u110b\u1175\u1103\u11732-1024x666-1-300x195.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/KorQuAD\u1100\u1161\u110b\u1175\u1103\u11732-1024x666-1-768x500.png 768w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-16063\" class=\"wp-caption-text\">Screenshot of the guideline on Cash Mission<\/figcaption><\/figure>\r\n\r\n&nbsp;\r\n\r\nEach and every participant was required to take a test to take part in the actual labeling project. Participants were required to check if the question provided as an example was correct or not, along with their reasons for choosing the answer.[\/vc_column_text]<div id=\"el1653559688556-d5891d9c-b9c9\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-heading-default font-weight-bold heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"\" data-anim-delay=\"0\">Data to solve pragmatic NLP problems<\/h2><\/div><\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1653584486592{padding-top: 40px !important;padding-bottom: 0px !important;}&#8221;]\r\n\r\n<figure id=\"attachment_16062\" aria-describedby=\"caption-attachment-16062\" style=\"width: 768px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16062 size-full\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1173\u110f\u1173\u1105\u1175\u11ab\u1109\u1163\u11ba-2022-05-12-\u110b\u1169\u1112\u116e-6.01.45-768x853-1.jpg\" alt=\"\" width=\"768\" height=\"853\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1173\u110f\u1173\u1105\u1175\u11ab\u1109\u1163\u11ba-2022-05-12-\u110b\u1169\u1112\u116e-6.01.45-768x853-1.jpg 
768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1173\u110f\u1173\u1105\u1175\u11ab\u1109\u1163\u11ba-2022-05-12-\u110b\u1169\u1112\u116e-6.01.45-768x853-1-270x300.jpg 270w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><figcaption id=\"caption-attachment-16062\" class=\"wp-caption-text\">https:\/\/korquad.github.io\/<\/figcaption><\/figure>\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\nKorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver. It has expanded the range of machine reading from simple sentences to long, complex ones, which solves some of the major issues in the NLP industry. Moreover, as fair evaluation continues, it is founding the basis for developing more effective NLP models.[\/vc_column_text]<div id=\"el1653559528137-ce17b0a7-dfc2\" class=\"w-100 d-block \"><\/div>[vc_column_text css=&#8221;.vc_custom_1653559762794{margin-top: 80px !important;border-top-width: 1px !important;padding-top: 80px !important;padding-bottom: 0px !important;border-top-color: rgba(0,0,0,0.2) !important;border-top-style: solid !important;}&#8221;]\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">References<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\nhttps:\/\/korquad.github.io\/dataset\/KorQuAD_2.0\/KorQuAD_2.0_paper.pdf\r\n\r\nhttps:\/\/korquad.github.io\/[\/vc_column_text][\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column width=&#8221;1\/2&#8243;]<div id=\"el1646794934167-c0c94dd3-ea74\" class=\"w-100 d-block \"><\/div><div class=\" mb-3 mb-md-0 \"  ><div class=\"card w-100 h-100 bg-white  vc_custom_1652982865548  pix-hover-item rounded-10 position-relative overflow-hidden2 text-white tilt fancy_card\" ><div class=\"card-img-overlay overflow-visible d-inline-block w-100 pix-img-overlay pix-p-30 d-flex align-items-end text-left\"><div class=\"w-100 \"><h3 class=\"card-title  
text-black font-weight-bold mb-0 animate-in\" style=\"\">See what we can do for you.<\/h3><p class=\"card-text pix-pt-10 text-black \" style=\"\">Build smarter AI with us.<\/p><div class=\"card-btn-div mt-4 d-inline-block w-100\"><a  href=\"https:\/\/datumo.com\" class=\"btn mb-2     text-white btn-black d-inline-block      btn-md\" target=\"_blank\" rel=\"noopener\"    ><span class=\"font-weight-bold \" >Learn More<\/span><\/a><\/div><\/div><\/div><\/div><\/div>[\/vc_column][vc_column width=&#8221;1\/2&#8243;]<div id=\"el1646794982519-9a19190b-7fde\" class=\"w-100 d-block \"><\/div><div class=\" mb-3 mb-md-0 \"  ><div class=\"card w-100 h-100 bg-black  vc_custom_1653559848133  pix-hover-item rounded-10 position-relative overflow-hidden2 text-white tilt fancy_card\" ><div class=\"card-img-overlay overflow-visible d-inline-block w-100 pix-img-overlay pix-p-30 d-flex align-items-end text-left\"><div class=\"w-100 \"><h3 class=\"card-title  text-white font-weight-bold mb-0 animate-in\" style=\"\">We would like to support the AI industry by sharing.<\/h3><p class=\"card-text pix-pt-10 text-white \" style=\"\"><\/p><div class=\"card-btn-div mt-4 d-inline-block w-100\"><a  href=\"https:\/\/open.datumo.com\/en\" class=\"btn mb-2    vc_custom_1653559848138  btn-primary d-inline-block      btn-md\" target=\"_blank\" rel=\"noopener\"    ><span class=\"font-weight-bold \" >Download Open Datasets<\/span><\/a><\/div><\/div><\/div><\/div><\/div>[\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div id=\"el1646799961152-e3ee06c0-4e82\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row]","protected":false},"excerpt":{"rendered":"[vc_row pix_particles_check=&#8221;&#8221;][vc_column][\/vc_column][\/vc_row][vc_section full_width=&#8221;stretch_row&#8221; pix_over_visibility=&#8221;&#8221; css=&#8221;.vc_custom_1650444445523{padding-top: 80px !important;padding-bottom: 80px !important;background-color: #f8f9fa !important;}&#8221; 
el_id=&#8221;pix_section_program&#8221;][vc_row pix_particles_check=&#8221;&#8221;][vc_column width=&#8221;1\/4&#8243;][\/vc_column][vc_column width=&#8221;3\/4&#8243;][vc_column_text css=&#8221;.vc_custom_1653580733187{padding-bottom: 0px !important;}&#8221;]This Korean question and answer dataset for web documents MRC was created by LG CNS and Datumo. We created 80,000+ question and answer&#8230;","protected":false},"author":1,"featured_media":15825,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36],"tags":[26,143,127,142,144,145],"class_list":["post-16001","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-usecases_success","tag-ai","tag-dataset","tag-datumo","tag-korquad","tag-machine-reading-comprehension","tag-mrc"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>KorQuad Dataset 2.0 - DATUMO<\/title>\n<meta name=\"description\" content=\"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"KorQuad Dataset 2.0\" \/>\n<meta property=\"og:description\" content=\"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\" \/>\n<meta property=\"og:site_name\" content=\"DATUMO\" 
\/>\n<meta property=\"article:published_time\" content=\"2022-05-26T09:25:53+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-10-22T08:17:10+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"860\" \/>\n\t<meta property=\"og:image:height\" content=\"500\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"DATUMO\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"KorQuad Dataset 2.0\" \/>\n<meta name=\"twitter:description\" content=\"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver.\" \/>\n<meta name=\"twitter:label1\" content=\"\uae00\uc4f4\uc774\" \/>\n\t<meta name=\"twitter:data1\" content=\"DATUMO\" \/>\n\t<meta name=\"twitter:label2\" content=\"\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04\" \/>\n\t<meta name=\"twitter:data2\" content=\"11\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"TechArticle\",\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#article\",\"isPartOf\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\"},\"author\":{\"name\":\"DATUMO\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6\"},\"headline\":\"KorQuad Dataset 
2.0\",\"datePublished\":\"2022-05-26T09:25:53+00:00\",\"dateModified\":\"2024-10-22T08:17:10+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\"},\"wordCount\":2563,\"publisher\":{\"@id\":\"https:\/\/blog.datumo.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage\"},\"thumbnailUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp\",\"keywords\":[\"AI\",\"Dataset\",\"datumo\",\"KorQuad\",\"Machine Reading Comprehension\",\"MRC\"],\"articleSection\":[\"use cases\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\",\"url\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\",\"name\":\"KorQuad Dataset 2.0 - DATUMO\",\"isPartOf\":{\"@id\":\"https:\/\/blog.datumo.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage\"},\"image\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage\"},\"thumbnailUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp\",\"datePublished\":\"2022-05-26T09:25:53+00:00\",\"dateModified\":\"2024-10-22T08:17:10+00:00\",\"description\":\"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and 
Naver.\",\"breadcrumb\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage\",\"url\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp\",\"contentUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp\",\"width\":860,\"height\":500},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/blog.datumo.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"KorQuad Dataset 2.0\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/blog.datumo.com\/#website\",\"url\":\"https:\/\/blog.datumo.com\/\",\"name\":\"DATUMO\",\"description\":\"The Data for Smarter AI\",\"publisher\":{\"@id\":\"https:\/\/blog.datumo.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/blog.datumo.com\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/blog.datumo.com\/#organization\",\"name\":\"DATUMO\",\"url\":\"https:\/\/blog.datumo.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp\",\"contentUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp\",\"width\":1080,\"height\":600,\"caption\":\"DATUMO\"},\"image\":{\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6\",\"name\":\"DATUMO\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g\",\"caption\":\"DATUMO\"},\"description\":\"DATUMO, The Data for Smarter AI. We seek to drive impact in the world by providing diverse and high quality data to build smarter AI.\",\"sameAs\":[\"https:\/\/blog.datumo.com\/en\"],\"url\":\"https:\/\/blog.datumo.com\/en\/author\/selectstar\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"KorQuad Dataset 2.0 - DATUMO","description":"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001","og_locale":"ko_KR","og_type":"article","og_title":"KorQuad Dataset 2.0","og_description":"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver.","og_url":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001","og_site_name":"DATUMO","article_published_time":"2022-05-26T09:25:53+00:00","article_modified_time":"2024-10-22T08:17:10+00:00","og_image":[{"width":860,"height":500,"url":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp","type":"image\/webp"}],"author":"DATUMO","twitter_card":"summary_large_image","twitter_title":"KorQuad Dataset 2.0","twitter_description":"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and Naver.","twitter_misc":{"\uae00\uc4f4\uc774":"DATUMO","\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04":"11\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"TechArticle","@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#article","isPartOf":{"@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001"},"author":{"name":"DATUMO","@id":"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6"},"headline":"KorQuad Dataset 
2.0","datePublished":"2022-05-26T09:25:53+00:00","dateModified":"2024-10-22T08:17:10+00:00","mainEntityOfPage":{"@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001"},"wordCount":2563,"publisher":{"@id":"https:\/\/blog.datumo.com\/#organization"},"image":{"@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage"},"thumbnailUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp","keywords":["AI","Dataset","datumo","KorQuad","Machine Reading Comprehension","MRC"],"articleSection":["use cases"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001","url":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001","name":"KorQuad Dataset 2.0 - DATUMO","isPartOf":{"@id":"https:\/\/blog.datumo.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage"},"image":{"@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage"},"thumbnailUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp","datePublished":"2022-05-26T09:25:53+00:00","dateModified":"2024-10-22T08:17:10+00:00","description":"KorQuAD 2.0 is currently open to anyone who needs it and is being used to measure the AI model performances of major companies such as Samsung SDS, Kakao, and 
Naver.","breadcrumb":{"@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/blog.datumo.com\/en\/usecases_success\/16001"]}]},{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#primaryimage","url":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp","contentUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/space_003.webp","width":860,"height":500},{"@type":"BreadcrumbList","@id":"https:\/\/blog.datumo.com\/en\/usecases_success\/16001#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/blog.datumo.com\/en\/"},{"@type":"ListItem","position":2,"name":"KorQuad Dataset 2.0"}]},{"@type":"WebSite","@id":"https:\/\/blog.datumo.com\/#website","url":"https:\/\/blog.datumo.com\/","name":"DATUMO","description":"The Data for Smarter AI","publisher":{"@id":"https:\/\/blog.datumo.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/blog.datumo.com\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/blog.datumo.com\/#organization","name":"DATUMO","url":"https:\/\/blog.datumo.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/","url":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp","contentUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp","width":1080,"height":600,"caption":"DATUMO"},"image":{"@id":"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6","name":"DATUMO","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/blog.datumo.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g","caption":"DATUMO"},"description":"DATUMO, The Data for Smarter AI. 
We seek to drive impact in the world by providing diverse and high quality data to build smarter AI.","sameAs":["https:\/\/blog.datumo.com\/en"],"url":"https:\/\/blog.datumo.com\/en\/author\/selectstar"}]}},"_links":{"self":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts\/16001","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/comments?post=16001"}],"version-history":[{"count":28,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts\/16001\/revisions"}],"predecessor-version":[{"id":16904,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts\/16001\/revisions\/16904"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/media\/15825"}],"wp:attachment":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/media?parent=16001"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/categories?post=16001"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/tags?post=16001"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}