{"id":15858,"date":"2022-05-24T08:50:40","date_gmt":"2022-05-24T08:50:40","guid":{"rendered":"https:\/\/blog.datumo.com\/en\/?p=15858"},"modified":"2024-10-22T08:08:32","modified_gmt":"2024-10-22T08:08:32","slug":"how-do-you-make-quality-data","status":"publish","type":"post","link":"https:\/\/blog.datumo.com\/en\/tech\/15858","title":{"rendered":"Core values of DATUMO"},"content":{"rendered":"[vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div class=\"pix-content-box card      vc_custom_1654579082115 custom-responsive-128103663   rounded-lg bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text]\r\n<p style=\"text-align: left;\"><span style=\"font-size: 14pt;\"><strong>\ud83d\udd11<\/strong> <strong>In 10 minutes you will learn:<\/strong><\/span><\/p>\r\n\r\n<ul class=\"p-rich_text_list p-rich_text_list__bullet\" data-stringify-type=\"unordered-list\" data-indent=\"0\" data-border=\"0\">\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">The 4 core values of Datumo<\/li>\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">How Datumo tries to maintain the core values<\/li>\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">Different types of data validation and their downsides<\/li>\r\n \t<li data-stringify-indent=\"0\" data-stringify-border=\"0\">Datumo\u2019s unique algorithms\/systems to acquire quality data<\/li>\r\n<\/ul>\r\n[\/vc_column_text]<\/div><\/div><div id=\"el1646799961152-e3ee06c0-4e82\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column width=&#8221;1\/4&#8243;]<div class=\"d-inline-block d-sm-flex w-100 text-center align-items-center justify-content-center  \"><div class=\"pix-lg-circles d-inline-block2 d-inline-flex align-items-center align-self-center align-middle\"><span class=\"align-middle circle-item pix-mr-5 pix-bg-custom animate-in \" data-anim-type=\"fade-in-left\" data-anim-delay=\"0\" 
data-toggle=\"tooltip\" data-placement=\"bottom\" title=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1109\u1166\u110b\u1167\u11b8\u1102\u1175\u11b8-460x460.png\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1109\u1166\u110b\u1167\u11b8\u1102\u1175\u11b8-300x245.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1109\u1166\u110b\u1167\u11b8\u1102\u1175\u11b8-768x628.png 768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1109\u1166\u110b\u1167\u11b8\u1102\u1175\u11b8.png 839w\" sizes=\"auto, (max-width: 839px) 100vw, 839px\" width=\"60\" height=\"60\" class=\"rounded-circle bg-white\" loading=\"lazy\" alt=\"\" \/><\/span><\/div><\/div><div id=\"el1653979501901-c1f23b2b-5f57\" class=\"w-100 d-block \"><\/div>[\/vc_column][vc_column width=&#8221;3\/4&#8243;]<div class=\"pix-content-box card      vc_custom_1653979547370    rounded-lg shadow shadow-hover-sm bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text css=&#8221;.vc_custom_1653979473318{padding-top: 40px !important;padding-right: 20px !important;padding-bottom: px !important;padding-left: 20px !important;}&#8221;]AI is only as smart as the quality of the training data.\r\n\r\nThis article is based on a speech given by <strong>David Kim, founder of DATUMO<\/strong>.[\/vc_column_text]<\/div><\/div>[\/vc_column][\/vc_row][vc_section full_width=&#8221;stretch_row&#8221; pix_over_visibility=&#8221;&#8221; css=&#8221;.vc_custom_1650444445523{padding-top: 80px !important;padding-bottom: 80px !important;background-color: #f8f9fa !important;}&#8221; el_id=&#8221;pix_section_program&#8221;][vc_row full_width=&#8221;stretch_row&#8221; pix_particles_check=&#8221;&#8221;][vc_column content_align=&#8221;text-center&#8221; offset=&#8221;vc_col-lg-offset-0 vc_col-lg-12 vc_col-md-offset-1 vc_col-md-10&#8243;]<div id=\"el1650442503491-f5da6b2f-fa35\" 
class=\"mb-3 text-left \"><h2 class=\"mb-32 pix-sliding-headline font-weight-bold secondary-font\" data-class=\"secondary-font text-heading-default\" data-style=\"\">What is \u201cquality data\u201d? How do you make quality data?<\/h2><\/div>[vc_column_text css=&#8221;.vc_custom_1653382331843{border-bottom-width: 0px !important;padding-top: 40px !important;padding-bottom: 40px !important;}&#8221;]\r\n<p style=\"text-align: left;\">AI models achieve intelligence from the manually annotated data. We call the annotated data \u201ctraining data.\u201d<\/p>\r\n<p style=\"text-align: left;\"><img fetchpriority=\"high\" decoding=\"async\" class=\"size-full wp-image-15619 aligncenter\" src=\"http:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/samuele-errico-piccarini-MyjVReZ5GLQ-unsplash.jpg\" alt=\"\" width=\"1200\" height=\"323\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/samuele-errico-piccarini-MyjVReZ5GLQ-unsplash.jpg 1200w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/samuele-errico-piccarini-MyjVReZ5GLQ-unsplash-300x81.jpg 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/samuele-errico-piccarini-MyjVReZ5GLQ-unsplash-1024x276.jpg 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/samuele-errico-piccarini-MyjVReZ5GLQ-unsplash-768x207.jpg 768w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><\/p>\r\n<p style=\"text-align: left;\">AI models and training data are deeply implemented in our daily lives. 
For instance, automated vehicles are trained with manually annotated data to detect and recognize objects.<\/p>\r\n<p style=\"text-align: left;\">Now it\u2019s all about Data-centric AI.<\/p>\r\n[\/vc_column_text]<div id=\"el1651511598477-bd41a385-16ef\" class=\"mb-3 text-left  vc_custom_1653979561211\"><h5 class=\"mb-32 pix-sliding-headline secondary-font\" data-class=\"secondary-font text-heading-default\" data-style=\"\">\u201cWithout training data, there would be no AI and without quality data, there would be no highly-performing AI.\u201d<\/h5><\/div>[\/vc_column][\/vc_row][\/vc_section][vc_row pix_particles_check=&#8221;&#8221;][vc_column][vc_raw_html]JTNDbWV0YSUyMGh0dHAtZXF1aXYlM0QlMjJyZWZyZXNoJTIyJTIwY29udGVudCUzRCUyMjAlM0IlMjB1cmwlM0RodHRwcyUzQSUyRiUyRmRhdHVtby5jb20lMkZlbiUyRmhvdy1kby15b3UtbWFrZS1xdWFsaXR5LWRhdGElMkYlMjIlM0U=[\/vc_raw_html]<div id=\"el1651647769840-086928bd-d0a8\" class=\"w-100 d-block \"><\/div>[vc_column_text css=&#8221;.vc_custom_1653382435817{border-bottom-width: 0px !important;padding-top: 40px !important;padding-bottom: 40px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]\r\n<p style=\"text-align: left;\">The trend has changed from focusing on AI models to focusing on quality datasets.<\/p>\r\n<p style=\"text-align: left;\"><em>-From model-centric to Data-centric AI<\/em><\/p>\r\n&nbsp;\r\n\r\nWhat is \u201cquality data\u201d?\r\n\r\nDavid points out four main characteristics of quality data.[\/vc_column_text]<div id=\"el1647236763948-8fa7de4a-2cf8\" class=\"w-100 d-block \"><\/div><div id=\"el1649936132924-c2bb5423-2188\" class=\"mb-3 text-center  vc_custom_1651647811232\"><h3 class=\"mb-32 pix-sliding-headline font-weight-bold secondary-font display-4\" data-class=\"secondary-font text-body-default\" data-style=\"\">ACCURACY, CONSISTENCY, COVERAGE, BALANCE<\/h3><\/div><div id=\"el1651647820815-a04fe879-782f\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left  
vc_custom_1651505584007\"><div><div class=\"slide-in-container\"><h2 class=\"text-black font-weight-bold animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">1. Accuracy<\/h2><\/div><\/div><\/div><div  class=\"pix-heading-el text-left  vc_custom_1653382455401\"><div><div class=\"slide-in-container\"><h3 class=\"text-black animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">Accuracy - Accurate data to fulfill the purpose<\/h3><\/div><\/div><\/div>[vc_column_text]<img decoding=\"async\" class=\"aligncenter wp-image-15992 size-full\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2694.png\" alt=\"\" width=\"1800\" height=\"900\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2694.png 1800w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2694-300x150.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2694-1024x512.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2694-768x384.png 768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2694-1536x768.png 1536w\" sizes=\"(max-width: 1800px) 100vw, 1800px\" \/>[\/vc_column_text]<div id=\"el1651498858979-efb0294d-3132\" class=\"w-100 d-block \"><\/div>[vc_column_text css=&#8221;.vc_custom_1653382539382{border-bottom-width: 0px !important;padding-bottom: 0px !important;border-bottom-color: rgba(0,0,0,0.2) !important;border-bottom-style: solid !important;}&#8221;]As shown in the image above, data must be annotated accurately, as intended. The essence is to not have any data that violate the guideline. 
Since all data are annotated manually, it is important to educate the annotators and run a strict validation system in order to achieve data accuracy.\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n<h3><strong>The key is to have highly readable guidelines.<\/strong><\/h3>\r\n&nbsp;\r\n\r\nSince its founding, Datumo has had a dedicated &lt;User Guidelines Team&gt; that focuses solely on writing simple, clear guidelines for the annotators.\r\n\r\n<img decoding=\"async\" class=\"alignleft size-full wp-image-15563\" src=\"http:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.jpg\" alt=\"\" width=\"947\" height=\"559\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.jpg 947w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3-300x177.jpg 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3-768x453.jpg 768w\" sizes=\"(max-width: 947px) 100vw, 947px\" \/>\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\nSimple, clear guidelines improve work efficiency.[\/vc_column_text]<div class=\"pix-content-box card      vc_custom_1654579121867 custom-responsive-66650800   rounded-lg shadow bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text css=&#8221;.vc_custom_1653382630456{border-bottom-width: 0px !important;padding-top: 10px !important;padding-bottom: 4px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]\r\n<h5><strong>\u201cSet accurate guidelines\u00a0 \u2192\u00a0 Train crowd-workers\u00a0 \u2192\u00a0 Achieve high-quality data\u201d\u00a0<\/strong>is an unchanging truth.<\/h5>\r\n[\/vc_column_text]<\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1653555277227{border-bottom-width: 0px !important;padding-bottom: 0px !important;border-bottom-color: rgba(0,0,0,0.2) !important;border-bottom-style: solid !important;}&#8221;]\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">Mandatory exams in order to participate 
in data annotation<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\nOne of Datumo\u2019s methods for quality control is the exam system. All crowd-workers are required to pass complex exams based on guidelines in order to participate in data annotation.\r\n\r\n&nbsp;\r\n\r\n<figure id=\"attachment_15564\" aria-describedby=\"caption-attachment-15564\" style=\"width: 1224px\" class=\"wp-caption alignleft\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-15564 size-full\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.1-e1653555132700.png\" alt=\"\" width=\"1224\" height=\"559\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.1-e1653555132700.png 1224w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.1-e1653555132700-300x137.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.1-e1653555132700-1024x468.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/3.1-e1653555132700-768x351.png 768w\" sizes=\"(max-width: 1224px) 100vw, 1224px\" \/><figcaption id=\"caption-attachment-15564\" class=\"wp-caption-text\">Feature: Test before participation<\/figcaption><\/figure>\r\n\r\n[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653659399660{margin-bottom: 40px !important;border-bottom-width: 0px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]\r\n<h5><\/h5>\r\n<h5><\/h5>\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">Strict validation based on Datumo\u2019s experience &amp; algorithm<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\nExperienced crowd-workers and in-house inspectors cross-validate the annotated dataset to filter inaccurate data. 
Furthermore, Datumo is the <span class=\"notion-enable-hover\" data-token-index=\"1\" data-reactroot=\"\">only<\/span> platform in Korea to run strict validation based on the reliability of the inspectors.\r\n\r\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16072\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644.png\" alt=\"\" width=\"1765\" height=\"792\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644.png 1765w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-300x135.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-1024x459.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-768x345.png 768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2644-1536x689.png 1536w\" sizes=\"(max-width: 1765px) 100vw, 1765px\" \/>[\/vc_column_text]<div class=\"pix-content-box card      vc_custom_1654579135379 custom-responsive-76944293   rounded-lg shadow bg- w-100  \"   ><div class=\"\" style=\"z-index:30;position:relative;\">[vc_column_text css=&#8221;.vc_custom_1653659199443{border-bottom-width: 1px !important;padding-bottom: 40px !important;border-bottom-color: rgba(0,0,0,0.2) !important;border-bottom-style: solid !important;}&#8221;]<strong>Single inspection<\/strong>\r\n\r\n\u2192 Highly dependent on the accuracy and attentiveness of the inspector[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653659221558{border-bottom-width: 1px !important;padding-bottom: 40px !important;border-bottom-color: rgba(0,0,0,0.2) !important;border-bottom-style: solid !important;}&#8221;]<strong>Cross-validation based on majority rule<\/strong>\r\n\r\n\u2192 Inspectors of low accuracy may outnumber those of high accuracy and thus, spoil data validation[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653659462389{border-bottom-width: 0px !important;padding-bottom: 0px 
!important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]Validation based on an algorithm that infers each inspector\u2019s reliability is more accurate than a single inspection or majority-rule cross-validation.[\/vc_column_text][vc_separator][vc_column_text css=&#8221;.vc_custom_1653659434318{border-bottom-width: 0px !important;padding-bottom: 0px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]Datumo is the\u00a0<b data-stringify-type=\"bold\">only one<\/b>\u00a0to carry out validation based on inspector reliability.[\/vc_column_text]<\/div><\/div>[vc_column_text css=&#8221;.vc_custom_1651509922020{border-bottom-width: 1px !important;padding-bottom: 40px !important;border-bottom-color: rgba(0,0,0,0.2) !important;border-bottom-style: solid !important;}&#8221;][\/vc_column_text]<div id=\"el1651511031278-787ba7bd-70d1\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-black font-weight-bold animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">2. 
Consistency<\/h2><\/div><\/div><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h3 class=\"text-black animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">Consistency - Consistent labeling results<\/h3><\/div><\/div><\/div>[vc_column_text]\r\n\r\n<figure id=\"attachment_15991\" aria-describedby=\"caption-attachment-15991\" style=\"width: 1213px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-15991 size-full\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/asdasd.png\" alt=\"\" width=\"1213\" height=\"692\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/asdasd.png 1213w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/asdasd-300x171.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/asdasd-1024x584.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/asdasd-768x438.png 768w\" sizes=\"(max-width: 1213px) 100vw, 1213px\" \/><figcaption id=\"caption-attachment-15991\" class=\"wp-caption-text\">*Guidelines without clear conditions make consistent labeling results difficult to achieve, because each worker applies subjective judgment.<\/figcaption><\/figure>\r\n\r\n[\/vc_column_text]<div id=\"el1651503341409-bd345760-4075\" class=\"w-100 d-block \"><\/div>[vc_column_text]\r\n<h5><strong>Objective labeling rules to maintain consistency<\/strong><\/h5>\r\n&nbsp;\r\n\r\nDatumo focuses on achieving consistent data by making sure the guidelines leave no room for ambiguity. 
For instance, when collecting or labeling datasets related to speech or emotion, we set standards in our guidelines so that every item falls into a defined category, which avoids subjective interpretations that vary by annotator.[\/vc_column_text][vc_column_text]\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">What technology does Datumo use to maintain consistency?<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15577 size-full\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u11732.jpg\" alt=\"\" width=\"900\" height=\"721\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u11732.jpg 900w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u11732-300x240.jpg 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u11732-768x615.jpg 768w\" sizes=\"(max-width: 900px) 100vw, 900px\" \/>\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\nOne of the most common issues in bounding-box datasets is that box sizes vary by annotator. As shown in the image above, without a specific guideline, Box 1 and Box 2 could each be judged correct or wrong depending on the inspector\u2019s perspective. To avoid such situations, Datumo came up with a solution: a patented inner guiding box. 
The inner guiding box, a UI system developed in-house by Datumo, provides a standard for both drawing and validating bounding boxes around the target object.[\/vc_column_text]<div id=\"el1646123312074-f8415338-9df9\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div id=\"el1646121976901-2b290989-6d0c\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h2 class=\"text-black font-weight-bold animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">3. Coverage<\/h2><\/div><\/div><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h3 class=\"text-black animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">Provide AI with enough examples drawn from a variety of data<\/h3><\/div><\/div><\/div><div id=\"el1651504638373-acda718f-0814\" class=\"w-100 d-block \"><\/div>[vc_column_text css=&#8221;.vc_custom_1653383087945{border-bottom-width: 0px !important;padding-bottom: 80px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">Datasets that cover the variety of cases an AI would face<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\nIt is important to expand the \u201ccoverage\u201d of datasets by considering the data collection environment when planning the whole process.\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n<h5><strong><span class=\"notion-enable-hover\" data-token-index=\"0\" data-reactroot=\"\">A data collection process based on the collection environment<\/span><\/strong><\/h5>\r\n&nbsp;\r\n\r\nThe essence of expanding coverage is to provide AI models with the variety of cases they could face in the real world.\r\n\r\nFor a facial image dataset, Datumo recruited 1,100 crowd-workers through our 
in-house crowd-sourcing platform, Cash Mission, and defined 3 lighting conditions, 8 situations, and 11 angles, which resulted in 264 distinct cases (3 \u00d7 8 \u00d7 11).[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653383169742{border-bottom-width: 0px !important;padding-bottom: 0px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15586 size-full\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u1173\u1105\u1161\u110b\u1175\u11ab4.png\" alt=\"\" width=\"800\" height=\"641\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u1173\u1105\u1161\u110b\u1175\u11ab4.png 800w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u1173\u1105\u1161\u110b\u1175\u11ab4-300x240.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2020\/02\/\u1100\u1161\u110b\u1175\u1103\u1173\u1105\u1161\u110b\u1175\u11ab4-768x615.png 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>[\/vc_column_text]<div id=\"el1651506730406-ae95cdfd-cff6\" class=\"w-100 d-block \"><\/div>[vc_column_text css=&#8221;.vc_custom_1653383202428{border-bottom-width: 0px !important;padding-bottom: 80px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]\r\n<h5><strong>Redundant data filtering system to ensure data variety<\/strong><\/h5>\r\n&nbsp;\r\n\r\nSimilar, redundant data reduce the value of the whole dataset, and at this volume it is almost impossible to filter such data manually. This is why we built our own data filtering system based on machine learning. Datumo, the only company to implement such a system, developed an AI model that automatically retrieves three existing images that look similar to the newly submitted image. 
An inspector then manually checks whether the newly submitted image is redundant.[\/vc_column_text][vc_column_text css=&#8221;.vc_custom_1653554976343{border-bottom-width: 0px !important;padding-bottom: 80px !important;border-bottom-color: rgba(0,0,0,0.2) !important;}&#8221;]<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-15994\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2726.png\" alt=\"\" width=\"1980\" height=\"1054\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2726.png 1980w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2726-300x160.png 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2726-1024x545.png 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2726-768x409.png 768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/Frame-2726-1536x818.png 1536w\" sizes=\"(max-width: 1980px) 100vw, 1980px\" \/>[\/vc_column_text]<div id=\"el1651513049672-bd8fcdaa-ee7b\" class=\"w-100 d-block \"><\/div><div  class=\"pix-heading-el text-left  vc_custom_1651509228886\"><div><div class=\"slide-in-container\"><h2 class=\"text-black font-weight-bold animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">4. Balance<\/h2><\/div><\/div><\/div><div  class=\"pix-heading-el text-left \"><div><div class=\"slide-in-container\"><h3 class=\"text-black animate-in heading-text el-title_custom_color mb-12\" style=\"\" data-anim-type=\"fade-in\" data-anim-delay=\"0\">Unbiased dataset<\/h3><\/div><\/div><\/div><div id=\"el1653554985753-2d34ac12-9c52\" class=\"w-100 d-block \"><\/div>[vc_column_text css=&#8221;.vc_custom_1653982516078{margin-bottom: 80px !important;}&#8221;]A good dataset must be unbiased. 
It is important to have a well-balanced dataset whose elements are represented in similar proportions.\r\n\r\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16246\" src=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/1233-1.jpg\" alt=\"\" width=\"1920\" height=\"1022\" srcset=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/1233-1.jpg 1920w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/1233-1-300x160.jpg 300w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/1233-1-1024x545.jpg 1024w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/1233-1-768x409.jpg 768w, https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/1233-1-1536x818.jpg 1536w\" sizes=\"(max-width: 1920px) 100vw, 1920px\" \/>[\/vc_column_text][vc_column_text]\r\n<h5><strong>Avoid bias by classifying data into categories<\/strong><\/h5>\r\n&nbsp;\r\n\r\nFor a text data collection project on automobile issue reports, we wanted to avoid ending up with only the few most common issues reported over and over. We therefore divided the issues into five categories (inoperable, visual, auditory, tactile, and olfactory) and made sure to collect a similar amount of data for each category.[\/vc_column_text][vc_column_text]As a result, we were able to minimize data imbalance and provide our client with the variety of data they had asked for.\r\n\r\n&nbsp;\r\n<h5><strong>From Model-centric to Data-centric AI<\/strong><\/h5>\r\n&nbsp;\r\n\r\nBuilding on this effort and technology, Datumo strives to make \u201cgood data.\u201d\r\n\r\nData quality has been the core element of the business since its founding. We will continue to optimize and improve our technology and systems to build a smoother flywheel. 
From a data-centric perspective, consistent, high-quality data are of the greatest value.\r\n\r\nDatumo aims to make an impact on the AI industry by using new technology to improve how AI training data is collected, one of the industry\u2019s worst bottlenecks.[\/vc_column_text]<div id=\"el1646123312074-f8415338-9df9\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column width=&#8221;1\/2&#8243;]<div id=\"el1646794934167-c0c94dd3-ea74\" class=\"w-100 d-block \"><\/div><div class=\" mb-3 mb-md-0 \"  ><div class=\"card w-100 h-100 bg-white  vc_custom_1652982865548  pix-hover-item rounded-10 position-relative overflow-hidden2 text-white tilt fancy_card\" ><div class=\"card-img-overlay overflow-visible d-inline-block w-100 pix-img-overlay pix-p-30 d-flex align-items-end text-left\"><div class=\"w-100 \"><h3 class=\"card-title  text-black font-weight-bold mb-0 animate-in\" style=\"\">See what we can do for you.<\/h3><p class=\"card-text pix-pt-10 text-black \" style=\"\">Build smarter AI with us.<\/p><div class=\"card-btn-div mt-4 d-inline-block w-100\"><a  href=\"https:\/\/datumo.com\" class=\"btn mb-2     text-white btn-black d-inline-block      btn-md\" target=\"_blank\" rel=\"noopener\"    ><span class=\"font-weight-bold \" >Learn More<\/span><\/a><\/div><\/div><\/div><\/div><\/div>[\/vc_column][vc_column width=&#8221;1\/2&#8243;]<div id=\"el1646794982519-9a19190b-7fde\" class=\"w-100 d-block \"><\/div><div class=\" mb-3 mb-md-0 \"  ><div class=\"card w-100 h-100 bg-black  vc_custom_1653383774824  pix-hover-item rounded-10 position-relative overflow-hidden2 text-white tilt fancy_card\" ><div class=\"card-img-overlay overflow-visible d-inline-block w-100 pix-img-overlay pix-p-30 d-flex align-items-end text-left\"><div class=\"w-100 \"><h3 class=\"card-title  text-white font-weight-bold mb-0 animate-in\" style=\"\">We would like to support the AI industry by sharing.<\/h3><p class=\"card-text 
pix-pt-10 text-white \" style=\"\"><\/p><div class=\"card-btn-div mt-4 d-inline-block w-100\"><a  href=\"https:\/\/open.datumo.com\/en\" class=\"btn mb-2    vc_custom_1653383774827  btn-primary d-inline-block      btn-md\" target=\"_blank\" rel=\"noopener\"    ><span class=\"font-weight-bold \" >Download Open Datasets<\/span><\/a><\/div><\/div><\/div><\/div><\/div>[\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column]<div id=\"el1646799961152-e3ee06c0-4e82\" class=\"w-100 d-block \"><\/div>[\/vc_column][\/vc_row]","protected":false},"excerpt":{"rendered":"[vc_row pix_particles_check=&#8221;&#8221;][vc_column][\/vc_column][\/vc_row][vc_row pix_particles_check=&#8221;&#8221;][vc_column width=&#8221;1\/4&#8243;][\/vc_column][vc_column width=&#8221;3\/4&#8243;][\/vc_column][\/vc_row][vc_section full_width=&#8221;stretch_row&#8221; pix_over_visibility=&#8221;&#8221; css=&#8221;.vc_custom_1650444445523{padding-top: 80px !important;padding-bottom: 80px !important;background-color: #f8f9fa !important;}&#8221; el_id=&#8221;pix_section_program&#8221;][vc_row full_width=&#8221;stretch_row&#8221; pix_particles_check=&#8221;&#8221;][vc_column content_align=&#8221;text-center&#8221; offset=&#8221;vc_col-lg-offset-0 vc_col-lg-12 vc_col-md-offset-1 vc_col-md-10&#8243;][vc_column_text css=&#8221;.vc_custom_1653382331843{border-bottom-width: 0px !important;padding-top: 40px !important;padding-bottom: 40px !important;}&#8221;] AI models achieve intelligence from the manually annotated data. 
We call&#8230;","protected":false},"author":1,"featured_media":15718,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[131],"tags":[26,130,127,126],"class_list":["post-15858","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech","tag-ai","tag-datasets","tag-datumo","tag-opendatasets"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Core values of DATUMO - DATUMO<\/title>\n<meta name=\"description\" content=\"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/blog.datumo.com\/en\/tech\/15858\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Core values of DATUMO\" \/>\n<meta property=\"og:description\" content=\"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality data.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/blog.datumo.com\/en\/tech\/15858\" \/>\n<meta property=\"og:site_name\" content=\"DATUMO\" \/>\n<meta property=\"article:published_time\" content=\"2022-05-24T08:50:40+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-10-22T08:08:32+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" 
\/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DATUMO\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"Core values of DATUMO\" \/>\n<meta name=\"twitter:description\" content=\"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality data.\" \/>\n<meta name=\"twitter:label1\" content=\"\uae00\uc4f4\uc774\" \/>\n\t<meta name=\"twitter:data1\" content=\"DATUMO\" \/>\n\t<meta name=\"twitter:label2\" content=\"\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04\" \/>\n\t<meta name=\"twitter:data2\" content=\"11\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"TechArticle\",\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#article\",\"isPartOf\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858\"},\"author\":{\"name\":\"DATUMO\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6\"},\"headline\":\"Core values of 
DATUMO\",\"datePublished\":\"2022-05-24T08:50:40+00:00\",\"dateModified\":\"2024-10-22T08:08:32+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858\"},\"wordCount\":2521,\"publisher\":{\"@id\":\"https:\/\/blog.datumo.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage\"},\"thumbnailUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg\",\"keywords\":[\"AI\",\"datasets\",\"datumo\",\"opendatasets\"],\"articleSection\":[\"tech\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858\",\"url\":\"https:\/\/blog.datumo.com\/en\/tech\/15858\",\"name\":\"Core values of DATUMO - DATUMO\",\"isPartOf\":{\"@id\":\"https:\/\/blog.datumo.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage\"},\"image\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage\"},\"thumbnailUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg\",\"datePublished\":\"2022-05-24T08:50:40+00:00\",\"dateModified\":\"2024-10-22T08:08:32+00:00\",\"description\":\"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality 
data.\",\"breadcrumb\":{\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/blog.datumo.com\/en\/tech\/15858\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage\",\"url\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg\",\"contentUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg\",\"width\":1920,\"height\":1080},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/blog.datumo.com\/en\/tech\/15858#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/blog.datumo.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Core values of DATUMO\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/blog.datumo.com\/#website\",\"url\":\"https:\/\/blog.datumo.com\/\",\"name\":\"DATUMO\",\"description\":\"The Data for Smarter AI\",\"publisher\":{\"@id\":\"https:\/\/blog.datumo.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/blog.datumo.com\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/blog.datumo.com\/#organization\",\"name\":\"DATUMO\",\"url\":\"https:\/\/blog.datumo.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp\",\"contentUrl\":\"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp\",\"width\":1080,\"height\":600,\"caption\":\"DATUMO\"},\"image\":{\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6\",\"name\":\"DATUMO\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/blog.datumo.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g\",\"caption\":\"DATUMO\"},\"description\":\"DATUMO, The Data for Smarter AI. We seek to drive impact in the world by providing diverse and high quality data to build smarter AI.\",\"sameAs\":[\"https:\/\/blog.datumo.com\/en\"],\"url\":\"https:\/\/blog.datumo.com\/en\/author\/selectstar\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Core values of DATUMO - DATUMO","description":"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality data.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/blog.datumo.com\/en\/tech\/15858","og_locale":"ko_KR","og_type":"article","og_title":"Core values of DATUMO","og_description":"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality data.","og_url":"https:\/\/blog.datumo.com\/en\/tech\/15858","og_site_name":"DATUMO","article_published_time":"2022-05-24T08:50:40+00:00","article_modified_time":"2024-10-22T08:08:32+00:00","og_image":[{"width":1920,"height":1080,"url":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg","type":"image\/jpeg"}],"author":"DATUMO","twitter_card":"summary_large_image","twitter_title":"Core values of DATUMO","twitter_description":"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality data.","twitter_misc":{"\uae00\uc4f4\uc774":"DATUMO","\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04":"11\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"TechArticle","@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#article","isPartOf":{"@id":"https:\/\/blog.datumo.com\/en\/tech\/15858"},"author":{"name":"DATUMO","@id":"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6"},"headline":"Core values of 
DATUMO","datePublished":"2022-05-24T08:50:40+00:00","dateModified":"2024-10-22T08:08:32+00:00","mainEntityOfPage":{"@id":"https:\/\/blog.datumo.com\/en\/tech\/15858"},"wordCount":2521,"publisher":{"@id":"https:\/\/blog.datumo.com\/#organization"},"image":{"@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage"},"thumbnailUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg","keywords":["AI","datasets","datumo","opendatasets"],"articleSection":["tech"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/blog.datumo.com\/en\/tech\/15858","url":"https:\/\/blog.datumo.com\/en\/tech\/15858","name":"Core values of DATUMO - DATUMO","isPartOf":{"@id":"https:\/\/blog.datumo.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage"},"image":{"@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage"},"thumbnailUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg","datePublished":"2022-05-24T08:50:40+00:00","dateModified":"2024-10-22T08:08:32+00:00","description":"The trend has changed from focusing on AI models to focusing on quality datasets.-From model-centric to Data-centric AIWhat is \u201cquality data\u201d?David points out four main characteristics of quality 
data.","breadcrumb":{"@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/blog.datumo.com\/en\/tech\/15858"]}]},{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#primaryimage","url":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg","contentUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/\u1109\u1166\u11af\u1109\u1173_Space_004.jpg","width":1920,"height":1080},{"@type":"BreadcrumbList","@id":"https:\/\/blog.datumo.com\/en\/tech\/15858#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/blog.datumo.com\/en\/"},{"@type":"ListItem","position":2,"name":"Core values of DATUMO"}]},{"@type":"WebSite","@id":"https:\/\/blog.datumo.com\/#website","url":"https:\/\/blog.datumo.com\/","name":"DATUMO","description":"The Data for Smarter AI","publisher":{"@id":"https:\/\/blog.datumo.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/blog.datumo.com\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/blog.datumo.com\/#organization","name":"DATUMO","url":"https:\/\/blog.datumo.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/","url":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp","contentUrl":"https:\/\/blog.datumo.com\/en\/wp-content\/uploads\/2022\/05\/2.1.webp","width":1080,"height":600,"caption":"DATUMO"},"image":{"@id":"https:\/\/blog.datumo.com\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/blog.datumo.com\/#\/schema\/person\/02ec2d0ba953b146878dab089dc735b6","name":"DATUMO","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/blog.datumo.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1942a8a63e1c8fa0d9be56cda789edd6c0a866259cd5dca24952597ffa8bab3d?s=96&d=mm&r=g","caption":"DATUMO"},"description":"DATUMO, The Data for Smarter AI. 
We seek to drive impact in the world by providing diverse and high quality data to build smarter AI.","sameAs":["https:\/\/blog.datumo.com\/en"],"url":"https:\/\/blog.datumo.com\/en\/author\/selectstar"}]}},"_links":{"self":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts\/15858","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/comments?post=15858"}],"version-history":[{"count":30,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts\/15858\/revisions"}],"predecessor-version":[{"id":16897,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/posts\/15858\/revisions\/16897"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/media\/15718"}],"wp:attachment":[{"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/media?parent=15858"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/categories?post=15858"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.datumo.com\/en\/wp-json\/wp\/v2\/tags?post=15858"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}