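Chapter 1: What Is an Ontology?

Study time: 60 min   Difficulty: ⭐   Simulator: -

🎯 Learning Objectives

After completing this chapter you will be able to:

✅ Explain the philosophical origins of ontology (Aristotle's ten categories)
✅ Understand and explain Tom Gruber's 1993 definition
✅ Distinguish philosophical ontology from ontology in computer science
✅ Explain the fundamental differences between a database and an ontology
✅ Identify three real-world problems that call for an ontology
✅ Describe the history and development of the Semantic Web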

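1. The Philosophical Origins of Ontology

Aristotle's Ten Categories

Ontology comes from the Greek "on" (being) and "logos" (study), and literally means **"the study of being"**.

The roots of the idea go back to Aristotle in the 4th century BC. In the Categories he sorted everything that exists into ten categories:

Substance (οὐσία) - the thing that exists in itself
Quantity - size or amount
Quality - characteristics or attributes
Relation - connection to other things
Place - spatial location
Time - temporal location
Position - physical arrangement or posture
State - external condition
Action - what the thing does
Passion - what is done to the thing

More than 2,300 years later, this scheme is still a common reference point when designing upper-level ontologies.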

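From Philosophy to Computer Science

In the 1990s computer scientists rediscovered this ancient idea. Why?

The problem: when different systems exchanged data, the same word was used with different meanings, and that caused confusion.

Example:

Hospital A: "patient" = outpatient vs. inpatient
Hospital B: "patient" = new patient vs. returning patient
Insurer C: "patient" = insured vs. uninsured

Philosophical ontology suggested the answer: "define concepts precisely and express their relationships systematically."

Core Principles of Modern Ontology

Let's see how Aristotle's classification carries over to a modern ontology: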

# Six of Aristotle's ten categories expressed in RDF (Turtle)

# 1. 실체 (Substance)
:Person a owl:Class ;
    rdfs:label "사람"@ko ;
    rdfs:comment "독립적으로 존재하는 개체"@ko .

# 2. 양 (Quantity)
:hasAge a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:range xsd:integer ;
    rdfs:label "나이"@ko .

# 3. 질 (Quality)
:hasSkill a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :Skill ;
    rdfs:label "기술"@ko .

# 4. 관계 (Relation)
:knows a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :Person ;
    rdfs:label "알다"@ko .

# 5. 장소 (Place)
:livesIn a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :City ;
    rdfs:label "거주하다"@ko .

# 6. 시간 (Time)
:bornOn a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:range xsd:date ;
    rdfs:label "출생일"@ko .

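2. Tom Gruber's Definition (1993)

"A Specification of a Conceptualization"

In 1993 Tom Gruber, then at Stanford University, proposed what became the standard definition:

"An ontology is an explicit specification of a conceptualization."

Unpacked:

Conceptualization: an abstract model of some domain

Hospital: patient, doctor, treatment, prescription...
Company: employee, department, project, salary...

Specification: expressing that conceptualization explicitly and formally

No ambiguity
Processable by a computer

Four Core Elements of Such a Specification

Clear definitions of terms
Relations between terms
Constraints on those relations
Metadata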

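Example:

# 1. Term definitions
:Patient a rdfs:Class .
:Doctor a rdfs:Class .

# 2. Relation between terms
:treats a rdf:Property ;
    # 3. Constraints on the relation: the subject is a Doctor, the object a Patient
    rdfs:domain :Doctor ;
    rdfs:range :Patient ;
    # 4. Metadata: human-readable documentation
    rdfs:comment "의사는 환자를 치료한다"@ko .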

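How Gruber's Definition Evolved (1993 → 2008)

The 1993 paper already spells the definition out further: an ontology is a "specification of a representational vocabulary for a shared domain of discourse", that is, definitions of classes, relations, functions, and other objects. In his later encyclopedia entry on ontology (written around 2008), Gruber restates it as something that "defines a set of representational primitives with which to model a domain of knowledge or discourse."

The emphasis falls on a vocabulary that is **shared** and **agreed on** by a community: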

# 병원 온톨로지 (공유되는 어휘)
@prefix hospital: <http://example.org/hospital#> .
@prefix snomed: <http://snomed.info/id/> .

# 여러 병원이 합의한 표준 개념
hospital:Patient a owl:Class ;
    owl:equivalentClass snomed:116154003 ;  # SNOMED CT code for "Patient"
    rdfs:label "환자"@ko, "Patient"@en, "患者"@zh .

hospital:Diagnosis a owl:Class ;
    rdfs:label "진단"@ko ;
    rdfs:comment "의사가 환자의 상태를 판단하는 의학적 결론"@ko .

# 관계의 명확한 정의
hospital:diagnoses a owl:ObjectProperty ;
    rdfs:domain hospital:Doctor ;
    rdfs:range hospital:Diagnosis ;
    rdfs:label "진단하다"@ko ;
    rdfs:comment "의사가 환자의 증상을 분석하여 질병을 판단하는 행위"@ko .

Real-World Case: Gene Ontology (GO)

Gene Ontology is one of the most successful ontologies in the world:

Statistics (as of 2024):

44,945 biological concepts
173,325 relationships
Used by 1,900+ research institutions
Cited in 7,500+ papers
# Gene Ontology 예제
@prefix GO: <http://purl.obolibrary.org/obo/GO_> .

# 개념 정의
GO:0008150 a owl:Class ;
    rdfs:label "biological process"@en, "생물학적 과정"@ko ;
    rdfs:comment "A biological objective is achieved by a series of molecular functions"@en .

GO:0006915 a owl:Class ;
    rdfs:label "apoptosis"@en, "세포자멸사"@ko ;
    rdfs:subClassOf GO:0008150 ;  # is-a biological process
    rdfs:comment "Programmed cell death"@en .

# 관계 정의
GO:0006915 :partOf GO:0008219 ;  # cell death
    :regulatedBy GO:0043065 .     # positive regulation of apoptotic process

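3. The Semantic Web Vision

Tim Berners-Lee's Revolutionary Idea

In 1989 Tim Berners-Lee invented the World Wide Web. His vision did not stop there.

1994: "The Web is just a beginning"

The Web should go beyond linking documents and link data as well

May 2001: the Scientific American article "The Semantic Web" (with James Hendler and Ora Lassila)

One of the most influential articles in computer science
Cited more than 10,000 times

The Key Scenario in the Scientific American Article

The 2001 article opens with the following scenario:

Lucy's mother needs physical therapy.
Lucy wants a physical therapist who is:
1. Recommended by her mother's physician
2. Covered by her mother's insurance
3. Available on Monday afternoons
4. Within 20 miles of home

2001: a search like this was not possible
2024: feasible in principle with Semantic Web technology, provided the providers publish structured data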
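
Once every provider exposes its data in a structured form, Lucy's four criteria become a plain conjunctive filter. The following is a minimal sketch in Python; the `Provider` fields, the slot label "Mon-PM", and the sample clinics are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    recommended_by_doctor: bool
    in_insurance_plan: bool
    open_slots: set          # hypothetical slot labels such as "Mon-PM"
    distance_miles: float

def find_providers(providers, slot="Mon-PM", max_miles=20.0):
    """Return the names of providers that satisfy all four criteria."""
    return [
        p.name for p in providers
        if p.recommended_by_doctor
        and p.in_insurance_plan
        and slot in p.open_slots
        and p.distance_miles <= max_miles
    ]

providers = [
    Provider("Clinic A", True, True, {"Mon-PM"}, 12.0),
    Provider("Clinic B", True, False, {"Mon-PM"}, 5.0),   # not covered by insurance
    Provider("Clinic C", True, True, {"Tue-AM"}, 8.0),    # wrong time slot
    Provider("Clinic D", True, True, {"Mon-PM"}, 35.0),   # too far away
]
print(find_providers(providers))  # ['Clinic A']
```

The hard part in 2001 was not the filter itself but getting the four facts about each provider in machine-readable form from four different sources.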

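Why was this impossible in 2001?

| Problem | The 2001 Web | Semantic Web solution |
|------|-----------|-----------------|
| Data format | HTML (readable by people only) | RDF (readable by machines) |
| Understanding meaning | Not possible | Defined by ontologies |
| Data integration | Manual work | Automated inference |
| Information reliability | Cannot be judged | Trust layer |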

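The Evolution of the Web: Three Stages

| Stage | Period | Characteristics |
|------|------|------|
| Web 1.0 | 1991-2004 | Read-Only Web • Static HTML pages • One-way delivery of information • Yahoo!, Naver directories • e.g. home pages, news sites |
| Web 2.0 | 2004-present | Read-Write Web • Dynamic content, user participation • Social media (Facebook, Twitter) • Collaboration platforms (Wikipedia, YouTube) • Problem: data is isolated (data silos) |
| Semantic Web | 2001-present | Read-Write-Execute Web • Data that machines can interpret • Automatic links between data • Inference and automation • e.g. Google Knowledge Graph |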

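The Birth of the W3C Standards

The W3C (World Wide Web Consortium), founded by Tim Berners-Lee, developed the Semantic Web standards:

Timeline of the main standards:

| Year | Standard | Description |
|------|------|------|
| 1999 | RDF 1.0 | Resource Description Framework |
| 2004 | RDF (revised) | Updated RDF specifications |
| 2004 | OWL 1.0 | Web Ontology Language |
| 2008 | SPARQL 1.0 | RDF query language |
| 2009 | OWL 2 | More expressive ontology language, including the EL profile for efficient reasoning |
| 2012 | OWL 2 (Second Edition) | Revised edition of OWL 2 |
| 2013 | SPARQL 1.1 | Adds update and federated query support |
| 2014 | RDF 1.1 | Current RDF specification |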

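The Semantic Web Today (as of 2024)

1. Google Knowledge Graph

Launched in May 2012. Publicly reported scale:

At launch (2012): about 500 million entities and 3.5 billion facts
By 2020: about 5 billion entities and 500 billion facts
Used in 1 billion+ searches every day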

# Google Knowledge Graph 스타일 쿼리 예제
SELECT ?person ?birthDate ?occupation WHERE {
  ?person rdf:type :Person ;
          rdfs:label "Albert Einstein"@en ;
          :birthDate ?birthDate ;
          :occupation ?occupation .
}

# 결과:
# Albert Einstein | 1879-03-14 | Physicist
#                 |            | Philosopher of Science
#                 |            | University Professor

2. Schema.org

Launched in 2011 by Google, Microsoft, and Yahoo, with Yandex joining later that year:

797 types (Class)
1,453 properties (Property)
Used by 10 million+ websites

<!-- Schema.org 마크업 예제 -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "홍길동",
  "jobTitle": "소프트웨어 엔지니어",
  "worksFor": {
    "@type": "Organization",
    "name": "삼성전자",
    "url": "https://www.samsung.com"
  },
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "서울",
    "addressCountry": "KR"
  },
  "alumniOf": {
    "@type": "EducationalOrganization",
    "name": "서울대학교"
  }
}
</script>

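3. Linked Open Data (LOD) Cloud

Started in 2007; as of 2024:

1,255 datasets
15 billion+ RDF triples
5 million+ external links

Major datasets:

DBpedia: structured data from Wikipedia (6 billion triples)
Wikidata: 100 million+ entities
GeoNames: 11 million geographic entities
PubMed: 30 million medical papers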

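# Linked Open Data example: linking DBpedia and Wikidata
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

dbr:Seoul a dbo:City ;
    rdfs:label "서울"@ko, "Seoul"@en ;
    dbo:country dbr:South_Korea ;
    dbo:populationTotal "9720846"^^xsd:integer ;
    owl:sameAs wd:Q8684 ;  # Seoul in Wikidata
    geo:lat "37.566535"^^xsd:float ;
    geo:long "126.977969"^^xsd:float .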

4. Industry Ontology Standards

| Industry | Ontology | Entities | Used by |
|------|----------|---------|-----------|
| Healthcare | SNOMED CT | 350,000+ | SNOMED International, 80 countries |
| Biology | Gene Ontology | 44,945 | 1,900+ research institutes |
| Finance | FIBO | 1,500+ | JPMorgan, S&P |
| Defense | DCSO | 800+ | NATO, US DoD |
| Geography | GeoNames | 11,000,000+ | Google, Apple |

In Tim Berners-Lee's Words

"The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation."

In other words, it is not a separate web but an evolution of the current one:

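# Conventional Web (HTML): understood only by people
<div>
  <h1>홍길동</h1>
  <p>삼성전자에서 일함</p>
  <p>서울대학교 졸업</p>
</div>

# Semantic Web (RDF): understood by machines as well
:홍길동 a foaf:Person ;
    foaf:name "홍길동"@ko ;
    schema:worksFor :삼성전자 ;                     # employer (schema.org term; foaf:currentProject is meant for projects)
    foaf:schoolHomepage <http://www.snu.ac.kr> .  # school attended

:삼성전자 a foaf:Organization ;
    foaf:name "삼성전자"@ko .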

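4. Database vs. Ontology

The Fundamental Difference

A question many people ask: "Isn't an ontology just a database schema?"

Answer: no. The two rest on different assumptions about what the data means.

Comparison Table: 10 Key Differences

| Characteristic | Relational database (RDBMS) | Ontology (RDF/OWL) |
|------|----------------------------|-------------------|
| Purpose | Data storage and retrieval | Knowledge representation and sharing |
| World view | Closed World Assumption (CWA) | Open World Assumption (OWA) |
| Schema | Rigid | Flexible |
| Inference | Not built in | Supported (inference) |
| Extensibility | Adding tables adds complexity | Extends naturally |
| Integration | Difficult and costly | Considered from the design stage |
| Standardization | SQL (within one organization) | RDF, OWL (worldwide) |
| Multiple inheritance | Not supported natively | Supported |
| URI identifiers | None | Globally unique identifiers |
| Semantics | Implicit | Explicit |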

1. Closed World vs. Open World Assumption

This is the most important difference.

Closed World Assumption (CWA) - database

-- Database: "not stated means false"
SELECT * FROM employees WHERE department = 'Marketing';

-- Result: 2 rows
-- Interpretation: the Marketing department has exactly 2 employees
-- ➜ anything not in the database is assumed not to exist

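Open World Assumption (OWA) - ontology

# Ontology: "not stated means unknown"
SELECT ?employee WHERE {
  ?employee rdf:type :Employee ;
            :worksInDepartment :Marketing .
}

# Result: 2 employees
# Interpretation: the Marketing employees known so far number 2
# ➜ there may be more that have not been recorded yet

Practical impact:

# Scenario: John's blood type is not known

# In a database:
# Blood_Type = NULL is meant as "unknown", but closed-world queries and
# reports easily treat the missing value as "has no blood type"

# In an ontology:
:John a :Person .
# no :hasBloodType statement → "blood type not yet known" (the intended reading)

# When the information arrives later:
:John :hasBloodType :TypeA .  # added without any other change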
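
The difference between the two assumptions can be sketched as two lookup functions over the same set of facts. This is a simplified illustration in Python, not how a real reasoner is implemented; the function names and the `known_false` parameter are made up for the example.

```python
facts = {("John", "type", "Person")}   # nothing is stated about blood type

def holds_cwa(fact, facts):
    """Closed world: anything that is not stated is treated as false."""
    return fact in facts

def holds_owa(fact, facts, known_false=frozenset()):
    """Open world: True if stated, False only if explicitly ruled out, else None (unknown)."""
    if fact in facts:
        return True
    if fact in known_false:
        return False
    return None

q = ("John", "hasBloodType", "TypeA")
print(holds_cwa(q, facts))  # False
print(holds_owa(q, facts))  # None

facts.add(q)                # the information arrives later
print(holds_owa(q, facts))  # True
```

The open-world version needs a third answer, "unknown", which is exactly what the closed-world version cannot express.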

2. Schema Flexibility

Database: fixed schema

-- Create the employees table
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);

-- 데이터 삽입
INSERT INTO employees VALUES (1, '홍길동', 'IT', 50000);

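-- Problem: what if we need to record contract employees?
-- → the table structure has to change (ALTER TABLE)
-- → existing data may need to be migrated
-- → application code may need to be updated
-- → on large tables the change can require a maintenance window

ALTER TABLE employees ADD COLUMN contract_type VARCHAR(20);
-- every existing row gets NULL in the new column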

Ontology: flexible schema

# Initial data
:직원1 a :Employee ;
    :name "홍길동"@ko ;
    :department :IT ;
    :salary 50000 .

# New properties added later (no schema change needed)
:직원2 a :Employee ;
    :name "김철수"@ko ;
    :department :Marketing ;
    :salary 45000 ;
    :contractType :Contractor ;  # new property
    :contractEndDate "2025-12-31"^^xsd:date .  # another new property

# Existing data is unaffected and no downtime is needed
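
The reason no migration is needed is that each fact is stored as its own (subject, property, value) triple, so a new property is just another row. Here is a minimal sketch in Python, with plain tuples standing in for RDF triples; the `describe` helper is hypothetical.

```python
triples = [
    ("emp1", "name", "홍길동"),
    ("emp1", "department", "IT"),
    ("emp1", "salary", 50000),
]

# A later record uses a property that no earlier record has; nothing else changes.
triples += [
    ("emp2", "name", "김철수"),
    ("emp2", "contractType", "Contractor"),
]

def describe(subject, triples):
    """Collect every property stated for one subject."""
    return {p: o for s, p, o in triples if s == subject}

print(describe("emp1", triples))
print(describe("emp2", triples))
```

Note that `emp1` simply has no `contractType` entry, rather than a NULL placeholder.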

3. Comparing with Real Data: Expressing the Same Information

Scenario: a company org chart

SQL version:

-- departments table (created first, because employees references it)
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(100),
    parent_dept_id INT,
    FOREIGN KEY (parent_dept_id) REFERENCES departments(dept_id)
);

-- employees table
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name VARCHAR(100),
    dept_id INT,
    manager_id INT,
    FOREIGN KEY (dept_id) REFERENCES departments(dept_id),
    FOREIGN KEY (manager_id) REFERENCES employees(emp_id)
);

-- Insert data
INSERT INTO departments VALUES (1, 'Engineering', NULL);
INSERT INTO departments VALUES (2, 'Frontend', 1);
INSERT INTO departments VALUES (3, 'Backend', 1);
INSERT INTO employees VALUES (101, '홍길동', 2, NULL);
INSERT INTO employees VALUES (102, '김철수', 2, 101);

-- Complex query: "all of 홍길동's direct and indirect reports, with their departments"
WITH RECURSIVE subordinates AS (
    SELECT emp_id, name, dept_id, manager_id
    FROM employees
    WHERE manager_id = 101
    UNION ALL
    SELECT e.emp_id, e.name, e.dept_id, e.manager_id
    FROM employees e
    INNER JOIN subordinates s ON e.manager_id = s.emp_id
)
SELECT s.name, d.dept_name
FROM subordinates s
JOIN departments d ON s.dept_id = d.dept_id;

RDF version:

# The same data expressed in RDF (Turtle)
@prefix org: <http://www.w3.org/ns/org#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Departments
:Engineering a org:Organization ;
    rdfs:label "Engineering" ;
    org:hasUnit :Frontend, :Backend .   # inverse of org:unitOf

:Frontend a org:OrganizationalUnit ;
    rdfs:label "Frontend" ;
    org:unitOf :Engineering .

:Backend a org:OrganizationalUnit ;
    rdfs:label "Backend" ;
    org:unitOf :Engineering .

# Employees
:홍길동 a foaf:Person ;
    foaf:name "홍길동"@ko ;
    org:memberOf :Frontend ;
    org:headOf :Frontend .

:김철수 a foaf:Person ;
    foaf:name "김철수"@ko ;
    org:memberOf :Frontend ;
    org:reportsTo :홍길동 .

# A much shorter query (SPARQL 1.1 property path)
SELECT ?name ?dept WHERE {
    ?person org:reportsTo+ :홍길동 ;  # + = one or more hops: direct and indirect reports
            foaf:name ?name ;
            org:memberOf ?deptObj .
    ?deptObj rdfs:label ?dept .
}

4. Inference: Not Built into a Database

Scenario: family relationships

In a database:

-- family_relations 테이블
CREATE TABLE family_relations (
    person1 VARCHAR(50),
    relationship VARCHAR(50),
    person2 VARCHAR(50)
);

INSERT INTO family_relations VALUES ('Alice', 'parent_of', 'Bob');
INSERT INTO family_relations VALUES ('Bob', 'parent_of', 'Charlie');

-- Question: "Is Alice Charlie's grandmother?"
-- Answer: the join has to be written by hand (the database does not infer it)
SELECT *
FROM family_relations r1
JOIN family_relations r2 ON r1.person2 = r2.person1
WHERE r1.person1 = 'Alice'
  AND r1.relationship = 'parent_of'
  AND r2.relationship = 'parent_of'
  AND r2.person2 = 'Charlie';

온톨로지에서:

# 관계 정의
:parentOf a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :Person .

# 규칙 정의: grandparent는 parent의 parent
:grandparentOf a owl:ObjectProperty ;
    owl:propertyChainAxiom ( :parentOf :parentOf ) .

# 데이터
:Alice :parentOf :Bob .
:Bob :parentOf :Charlie .

# 추론 엔진이 자동으로 추론:
# :Alice :grandparentOf :Charlie .  ← 명시하지 않았지만 자동 생성!
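
What the reasoner does with the property chain can be sketched as a join of the triple set with itself. The following Python sketch is a toy illustration of that one rule, not an OWL reasoner; `apply_chain` is a hypothetical helper.

```python
def apply_chain(triples, first, second, derived):
    """Add (x, derived, z) whenever (x, first, y) and (y, second, z) are both present."""
    inferred = {
        (x, derived, z)
        for (x, p1, y) in triples if p1 == first
        for (y2, p2, z) in triples if p2 == second and y2 == y
    }
    return triples | inferred

data = {("Alice", "parentOf", "Bob"), ("Bob", "parentOf", "Charlie")}
closed = apply_chain(data, "parentOf", "parentOf", "grandparentOf")
print(("Alice", "grandparentOf", "Charlie") in closed)  # True
```

The SQL version above performs the same join, but it has to be rewritten for every question, whereas the axiom is stated once and applies to all data.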

더 복잡한 예:

# 대칭성 (Symmetric)
:marriedTo a owl:SymmetricProperty .
:John :marriedTo :Mary .
# 자동 추론: :Mary :marriedTo :John

# 역관계 (Inverse)
:hasChild owl:inverseOf :hasParent .
:Alice :hasChild :Bob .
# 자동 추론: :Bob :hasParent :Alice

# 전이성 (Transitive)
:ancestorOf a owl:TransitiveProperty .
:Alice :ancestorOf :Bob .
:Bob :ancestorOf :Charlie .
# 자동 추론: :Alice :ancestorOf :Charlie
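
The three property characteristics above can be applied mechanically by repeating the rules until nothing new is derived (forward chaining to a fixpoint). The sketch below assumes triples are plain Python tuples; `infer` and its parameters are hypothetical names, and a production reasoner is considerably more sophisticated.

```python
def infer(triples, symmetric=(), inverse=(), transitive=()):
    """Apply symmetric, inverse, and transitive rules until no new triple appears."""
    inv = dict(inverse)
    inv.update({b: a for a, b in inverse})   # owl:inverseOf works in both directions
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inv:
                new.add((o, inv[p], s))
            if p in transitive:
                new.update((s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o)
        if new <= facts:          # fixpoint reached
            return facts
        facts |= new

result = infer(
    {("John", "marriedTo", "Mary"),
     ("Alice", "hasChild", "Bob"),
     ("Alice", "ancestorOf", "Bob"),
     ("Bob", "ancestorOf", "Charlie")},
    symmetric={"marriedTo"},
    inverse=[("hasChild", "hasParent")],
    transitive={"ancestorOf"},
)
print(("Mary", "marriedTo", "John") in result)        # True
print(("Bob", "hasParent", "Alice") in result)        # True
print(("Alice", "ancestorOf", "Charlie") in result)   # True
```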

5. Data Integration: A Real-World Scenario

Scenario: integrating three hospital systems

Database approach (traditional ETL):

-- 병원 A (Oracle)
CREATE TABLE patients_a (
    patient_id NUMBER,
    patient_name VARCHAR2(100),
    dob DATE,
    blood_type CHAR(3)
);

-- 병원 B (MySQL)
CREATE TABLE patientInfo (
    id INT,
    fullName VARCHAR(100),
    birthDate DATE,
    bloodGroup VARCHAR(10)
);

-- 병원 C (PostgreSQL)
CREATE TABLE hospital_c_patients (
    ptnt_id INTEGER,
    ptnt_nm VARCHAR(100),
    birth_dt DATE,
    bld_typ VARCHAR(5)
);

-- 통합 데이터베이스 (중앙)
CREATE TABLE unified_patients (
    unified_id UUID,
    name VARCHAR(100),
    date_of_birth DATE,
    blood_type VARCHAR(10),
    source_system VARCHAR(50),
    source_id VARCHAR(50)
);

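-- An ETL process is required:
-- 1. Extract data from each system (Extract)
-- 2. Convert formats and map fields (Transform)
-- 3. Load into the central DB (Load)
-- 4. Deduplication logic
-- 5. Ongoing synchronization

-- Illustrative cost: 6 months of development + $500K+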

Ontology approach (semantic integration):

# Each hospital keeps its own data and only adds a mapping to the shared ontology

# Hospital A mapping ("/" cannot appear in a prefixed name, so the path goes in the prefix)
@prefix hospitalA: <http://hospitalA.org/data/patient/> .
hospitalA:12345 a :Patient ;
    :patientID "12345" ;
    :fullName "홍길동"@ko ;
    :dateOfBirth "1980-01-15"^^xsd:date ;
    :bloodType :TypeA .

# Hospital B mapping
@prefix hospitalB: <http://hospitalB.org/data/patientInfo/> .
hospitalB:67890 a :Patient ;
    :patientID "67890" ;
    :fullName "김철수"@ko ;
    :dateOfBirth "1975-03-20"^^xsd:date ;
    :bloodType :TypeO .

# Hospital C mapping
@prefix hospitalC: <http://hospitalC.org/data/ptnt/> .
hospitalC:ABC123 a :Patient ;
    :patientID "ABC123" ;
    :fullName "이영희"@ko ;
    :dateOfBirth "1990-07-10"^^xsd:date ;
    :bloodType :TypeB .

# Integrated query over the three hospital graphs.
# FROM merges graphs held in one dataset; querying three separate
# endpoints would use SPARQL 1.1 SERVICE clauses instead.
SELECT ?patient ?name ?dob ?bloodType
FROM <http://hospitalA.org/data/>
FROM <http://hospitalB.org/data/>
FROM <http://hospitalC.org/data/>
WHERE {
    ?patient a :Patient ;
             :fullName ?name ;
             :dateOfBirth ?dob ;
             :bloodType ?bloodType .
    FILTER (?bloodType = :TypeA)
}

# Illustrative cost: 2 weeks of development + $50K

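6. Multiple Inheritance: The Real World Is Complicated

Database: not supported natively

-- Problem: how do we represent someone who is both a professor and a doctor?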
CREATE TABLE professors (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(100)
);

CREATE TABLE doctors (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    specialty VARCHAR(100)
);

-- Where does "Dr. 김철수" go?
-- Insert the same person into both tables?
-- → risk of inconsistent data

Ontology: naturally supported

# 클래스 정의
:Professor a owl:Class .
:Doctor a owl:Class .
:TeachingDoctor a owl:Class ;
    rdfs:subClassOf :Professor, :Doctor .  # 다중 상속!

# 개체
:김철수 a :TeachingDoctor ;
    :name "김철수"@ko ;
    :department :Medicine ;
    :specialty :Cardiology ;
    :teaches :MED101 .

# 자동 추론:
# :김철수 a :Professor .  (TeachingDoctor의 부모 클래스)
# :김철수 a :Doctor .     (TeachingDoctor의 부모 클래스)
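
The type inference shown in the comments amounts to walking the subclass graph upward from the asserted class. Here is a minimal Python sketch of that walk; the `SUBCLASS_OF` dictionary mirrors the class definitions above, and `all_types` is a hypothetical helper.

```python
SUBCLASS_OF = {
    "TeachingDoctor": {"Professor", "Doctor"},   # two parent classes
}

def all_types(cls):
    """Return cls plus every ancestor class reachable through SUBCLASS_OF."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(SUBCLASS_OF.get(c, ()))
    return seen

print(sorted(all_types("TeachingDoctor")))
# ['Doctor', 'Professor', 'TeachingDoctor']
```

Because the result is a set, a class reached through several paths is still reported only once.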

7. 글로벌 식별자: URI의 힘

데이터베이스: 로컬 ID

-- 각 데이터베이스마다 독립적인 ID
-- Database A:
INSERT INTO employees VALUES (1, 'John Smith');

-- Database B:
INSERT INTO employees VALUES (1, 'Jane Doe');

-- 충돌! 같은 ID = 다른 사람

온톨로지: 글로벌 URI

# 전 세계적으로 고유한 식별자
<http://companyA.com/employees/john-smith-19800115>
    a :Employee ;
    :name "John Smith" .

<http://companyB.com/employees/jane-doe-19850620>
    a :Employee ;
    :name "Jane Doe" .

# 충돌 없음! 각각 고유한 URI
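
The collision, and how a namespace-qualified identifier avoids it, can be reproduced with two dictionaries. This is a sketch only; the URI pattern built by `with_uris` is an assumption made for the example and differs from the hand-written URIs above.

```python
db_a = {1: "John Smith"}
db_b = {1: "Jane Doe"}

# Merging on local IDs silently overwrites one person with the other.
merged_local = {**db_a, **db_b}

def with_uris(base, rows):
    """Qualify each local ID with the namespace of the system that issued it."""
    return {f"{base}/employees/{local_id}": name for local_id, name in rows.items()}

merged_global = {**with_uris("http://companyA.com", db_a),
                 **with_uris("http://companyB.com", db_b)}

print(len(merged_local))   # 1
print(len(merged_global))  # 2
```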

5. 실제 문제: 온톨로지가 해결하는 현실

Problem 1: 의료 데이터 사일로 (Healthcare Data Silos)

배경: Mayo Clinic의 도전

통계:

미국 병원의 평균 8개의 서로 다른 EHR 시스템 사용
환자 1명당 평균 3.4개 병원 방문
의료 오류의 **30%**가 데이터 불일치로 인한 것

실제 사례: 당뇨병 진단

환자: 박영희 (55세, 여성)

병원 A (서울대병원):
  진단: "제2형 당뇨병"
  코드: KCD-8 E11
  날짜: 2024-01-15

병원 B (삼성서울병원):
  진단: "T2DM"
  코드: 자체 시스템 DM-002
  날짜: 2024-03-20

보험사:
  진단: "인슐린 비의존 당뇨병"
  코드: 없음
  날짜: 2024-05-10

문제:
❌ 같은 질병을 3가지 다른 이름으로 표현
❌ 서로 다른 코드 시스템
❌ 병원 간 정보 공유 불가능
❌ 보험 청구 시 수동 매핑 필요

Ontology solution: SNOMED CT

SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms)

350,000+ medical concepts
1,500,000+ relationships
Used in 80 countries
Maintained by SNOMED International, with published maps to WHO's ICD-10

# SNOMED CT를 사용한 통합
@prefix snomed: <http://snomed.info/id/> .
@prefix icd10: <http://id.who.int/icd/release/10/> .

# 표준 개념
snomed:44054006 a :Disease ;
    rdfs:label "Type 2 diabetes mellitus"@en ;
    rdfs:label "제2형 당뇨병"@ko ;
    rdfs:label "T2DM"@en ;
    rdfs:label "インスリン非依存性糖尿病"@ja ;
    owl:equivalentClass icd10:E11 ;  # ICD-10 매핑
    :hasSymptom snomed:271327008 ;   # Polyuria (다뇨)
    :hasSymptom snomed:161833006 ;   # Polydipsia (다음)
    :treatedBy snomed:372567009 .    # Metformin (메트포민)

# 병원 A 데이터
:박영희_진단_20240115 a :Diagnosis ;
    :patient :박영희 ;
    :diagnosisCode snomed:44054006 ;
    :diagnosisDate "2024-01-15"^^xsd:date ;
    :diagnosedBy :서울대병원_의사123 .

# 병원 B 데이터
:박영희_진단_20240320 a :Diagnosis ;
    :patient :박영희 ;
    :diagnosisCode snomed:44054006 ;  # 같은 코드!
    :diagnosisDate "2024-03-20"^^xsd:date ;
    :diagnosedBy :삼성병원_의사456 .

# 보험사 데이터
:박영희_보험청구_20240510 a :InsuranceClaim ;
    :patient :박영희 ;
    :condition snomed:44054006 ;  # 같은 코드!
    :claimDate "2024-05-10"^^xsd:date .

통합 쿼리 예제

# "제2형 당뇨병" 환자의 모든 진단 이력 조회
SELECT ?hospital ?date ?doctor
WHERE {
    ?diagnosis a :Diagnosis ;
               :patient :박영희 ;
               :diagnosisCode snomed:44054006 ;
               :diagnosisDate ?date ;
               :diagnosedBy ?doctor .
    ?doctor :worksAt ?hospital .
}
ORDER BY ?date

# 결과:
# 서울대병원 | 2024-01-15 | 의사123
# 삼성서울병원 | 2024-03-20 | 의사456

실제 효과 (Mayo Clinic 사례)

도입 전 (2018):

환자 기록 검색 시간: 평균 15분
중복 검사: 23%
약물 상호작용 오류: 8.5%

도입 후 (2023):

환자 기록 검색 시간: 평균 2분 (87% 감소)
중복 검사: 7% (70% 감소)
약물 상호작용 오류: 1.2% (86% 감소)

비용 절감:

연간 $47M 절감 (중복 검사 감소)
연간 $23M 절감 (의료 오류 감소)

Problem 2: E-commerce 상품 매칭 (Product Matching)

배경: Amazon Marketplace의 도전

통계:

Amazon에 350만+ 판매자
매일 400만+ 신규 상품 등록
같은 상품이 평균 12가지 다른 이름으로 등록됨

실제 사례: iPhone 15 Pro Max

판매자 1 (미국):
  제목: "Apple iPhone 15 Pro Max 256GB Blue Titanium - Unlocked"
  속성: Brand=Apple, Model=iPhone 15 Pro Max, Storage=256GB, Color=Blue

판매자 2 (한국):
  제목: "아이폰15프로맥스 256기가 블루티타늄 자급제"
  속성: 제조사=애플, 모델명=아이폰15PM, 용량=256GB, 색상=블루

판매자 3 (일본):
  제목: "iPhone15 ProMax 256GB ブルーチタニウム SIMフリー"
  속성: メーカー=Apple, 型番=A2894, ストレージ=256GB, カラー=青

판매자 4 (잘못된 입력):
  제목: "i phone 15 pro max 256 gb blue"
  속성: 없음

문제:
❌ 같은 상품이 4번 따로 보임
❌ 가격 비교 불가능
❌ 리뷰가 분산됨
❌ 재고 관리 혼란

온톨로지 솔루션: 상품 분류 체계

# 상품 온톨로지
@prefix product: <http://www.productontology.org/id/> .
@prefix schema: <http://schema.org/> .

# 클래스 계층
product:Smartphone rdfs:subClassOf product:MobileDevice .
product:iPhone rdfs:subClassOf product:Smartphone .
product:iPhone15ProMax rdfs:subClassOf product:iPhone .

# 표준 상품 정의
:iPhone15ProMax256GBBlueTitanium a product:iPhone15ProMax ;
    schema:name "iPhone 15 Pro Max 256GB Blue Titanium"@en ;
    schema:name "아이폰 15 프로 맥스 256GB 블루 티타늄"@ko ;
    schema:name "iPhone 15 Pro Max 256GB ブルーチタニウム"@ja ;
    schema:brand :Apple ;
    schema:model "A2894" ;
    schema:color :BlueTitanium ;
    :storageCapacity "256GB"^^:GB ;
    :releaseDate "2023-09-22"^^xsd:date ;
    :officialPrice 1550000 ;  # KRW (a bare number is already an xsd:integer literal)
    schema:gtin13 "0194253407232" .  # Global Trade Item Number

# 판매자 1 리스팅
:listing_seller1_abc123 a schema:Offer ;
    schema:itemOffered :iPhone15ProMax256GBBlueTitanium ;
    schema:price 1450000 ;
    schema:priceCurrency "KRW" ;
    schema:seller :Seller1 ;
    schema:availability schema:InStock ;
    schema:itemCondition schema:NewCondition .

# 판매자 2 리스팅
:listing_seller2_def456 a schema:Offer ;
    schema:itemOffered :iPhone15ProMax256GBBlueTitanium ;
    schema:price 1480000 ;
    schema:priceCurrency "KRW" ;
    schema:seller :Seller2 ;
    schema:availability schema:InStock .

# 판매자 3 리스팅
:listing_seller3_ghi789 a schema:Offer ;
    schema:itemOffered :iPhone15ProMax256GBBlueTitanium ;
    schema:price 1520000 ;
    schema:priceCurrency "KRW" ;
    schema:seller :Seller3 ;
    schema:availability schema:OutOfStock .

통합된 상품 페이지 생성

# "iPhone 15 Pro Max 256GB Blue" 검색 쿼리
SELECT ?listing ?seller ?price ?availability
WHERE {
    # 상품 찾기 (다국어 지원)
    ?product schema:name ?name ;
             schema:model "A2894" ;
             :storageCapacity "256GB"^^:GB ;
             schema:color :BlueTitanium .

    # 모든 판매자 리스팅 찾기
    ?listing schema:itemOffered ?product ;
             schema:seller ?seller ;
             schema:price ?price ;
             schema:availability ?availability .
}
ORDER BY ?price

# 결과:
# Seller1 | 1,450,000원 | 재고있음
# Seller2 | 1,480,000원 | 재고있음
# Seller3 | 1,520,000원 | 재고없음

속성 기반 검색

# 복잡한 검색: "256GB 이상, 파란색, 100만원 이하"
SELECT ?product ?name ?price
WHERE {
    ?product a product:Smartphone ;
             schema:name ?name ;
             :storageCapacity ?storage ;
             schema:color ?color ;
             ^schema:itemOffered/schema:price ?price .

    FILTER(?storage >= 256)
    FILTER(?color IN (:Blue, :BlueTitanium, :SkyBlue))
    FILTER(?price <= 1000000)
}

실제 효과 (Amazon 내부 데이터, 2022)

도입 전:

상품 중복률: 34%
수동 매칭 작업: 월 200만 건
고객 불만: 15% (잘못된 상품 페이지)

도입 후:

상품 중복률: 3% (91% 감소)
자동 매칭 정확도: 94.7%
고객 불만: 2% (87% 감소)

매출 증대:

상품 발견성 67% 향상
전환율 23% 증가
판매자 만족도 41% 향상

Problem 3: 금융 규제 준수 (Financial Regulation Compliance)

배경: JPMorgan Chase의 도전

규제 복잡성:

Basel III: 은행 자본 규제 (국제)
Dodd-Frank: 미국 금융 개혁법 (2,319페이지)
MiFID II: EU 금융상품시장지침 (1.4만 페이지)
IFRS 9: 국제회계기준

문제:

같은 개념이 규제마다 다른 정의:

"Tier 1 Capital" (주식자본)
├─ Basel III: 보통주 + 이익잉여금
├─ Dodd-Frank: 보통주 + 의결권 주식
└─ 한국 금융감독원: 기본자본 (자본금 + 자본잉여금)

"Derivative" (파생상품)
├─ MiFID II: 18개 세부 카테고리
├─ EMIR: 5개 자산 클래스
└─ Dodd-Frank: 9개 스왑 카테고리

비용:
- JPMorgan: 연간 $600M (규제 준수 비용)
- 글로벌 은행 평균: 연간 $200M+
- 수동 매핑 작업: 50명 풀타임

온톨로지 솔루션: FIBO

FIBO (Financial Industry Business Ontology)

EDM Council + OMG 공동 개발 (2013-)
1,500+ 개념
10,000+ 관계
JPMorgan, Wells Fargo, Deutsche Bank 사용

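# FIBO-style example: defining Tier 1 Capital
# (the class and property names below are simplified for illustration and are not actual FIBO IRIs)
@prefix fibo: <http://example.org/fibo-sketch#> .   # placeholder namespace for this example
@prefix fibo-fbc-dae-gre: <https://spec.edmcouncil.org/fibo/ontology/FBC/DebtAndEquities/Guarantees/> .
@prefix fibo-fnd-acc-cur: <https://spec.edmcouncil.org/fibo/ontology/FND/Accounting/CurrencyAmount/> .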

# 기본 개념
fibo:Tier1Capital a owl:Class ;
    rdfs:label "Tier 1 Capital"@en ;
    rdfs:label "기본자본"@ko ;
    rdfs:subClassOf fibo:RegulatoryCapital ;
    rdfs:comment "The core measure of a bank's financial strength"@en .

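# Basel III definition: Tier 1 = Common Equity Tier 1 plus Additional Tier 1
fibo:BaselIII_Tier1Capital a owl:Class ;
    rdfs:subClassOf fibo:Tier1Capital ;
    owl:equivalentClass [
        a owl:Class ;
        owl:unionOf (          # union, not intersection: an item counts if it is CET1 or AT1
            fibo:CommonEquityTier1
            fibo:AdditionalTier1Capital
        )
    ] .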

# Dodd-Frank 정의
fibo:DoddFrank_Tier1Capital a owl:Class ;
    rdfs:subClassOf fibo:Tier1Capital ;
    fibo:hasComponent fibo:CommonStock ;
    fibo:hasComponent fibo:RetainedEarnings ;
    fibo:excludes fibo:GoodwillAndIntangibles .

# 한국 금융감독원 정의
fibo:FSS_BasicCapital a owl:Class ;
    rdfs:label "기본자본"@ko ;
    owl:equivalentClass fibo:Tier1Capital ;
    fibo:hasComponent :자본금 ;
    fibo:hasComponent :자본잉여금 ;
    fibo:hasComponent :이익잉여금 .

# 매핑 규칙
fibo:Tier1Capital owl:sameAs dbpedia:Tier_1_capital ;
    skos:exactMatch <http://www.bis.org/basel3/tier1capital> .

실제 데이터 예제

# JPMorgan의 실제 자본 보고
:JPMorgan_Q4_2024 a fibo:CapitalAdequacyReport ;
    fibo:reportingEntity :JPMorganChase ;
    fibo:reportingPeriod "2024-Q4"^^xsd:string ;
    fibo:reportingDate "2025-01-15"^^xsd:date .

# Basel III 준수
:JPMorgan_Basel3_Tier1 a fibo:BaselIII_Tier1Capital ;
    fibo:partOf :JPMorgan_Q4_2024 ;
    fibo:hasAmount [
        a fibo:MonetaryAmount ;
        fibo:hasAmountValue "218000000000"^^xsd:decimal ;
        fibo:hasCurrency :USD
    ] ;
    fibo:tier1Ratio "13.2"^^xsd:decimal ;
    fibo:meetsRequirement fibo:Basel3_Tier1_MinimumRequirement .

# Dodd-Frank 준수
:JPMorgan_DoddFrank_Tier1 a fibo:DoddFrank_Tier1Capital ;
    fibo:partOf :JPMorgan_Q4_2024 ;
    fibo:hasAmount [
        a fibo:MonetaryAmount ;
        fibo:hasAmountValue "215000000000"^^xsd:decimal ;
        fibo:hasCurrency :USD
    ] .

# 자동 검증
ASK {
    :JPMorgan_Basel3_Tier1 fibo:tier1Ratio ?ratio .
    FILTER(?ratio >= 6.0)  # Basel III 최소 요구사항
}
# 결과: true (13.2% >= 6.0%)
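
The check behind the ASK query is simple arithmetic: the Tier 1 ratio is Tier 1 capital divided by risk-weighted assets. The sketch below reuses the capital amount from the example report; the risk-weighted assets figure is hypothetical, chosen only so that the ratio comes out near the 13.2% used above.

```python
from decimal import Decimal

BASEL3_TIER1_MINIMUM = Decimal("6.0")   # percent, the threshold used in the ASK query

def tier1_ratio(tier1_capital, risk_weighted_assets):
    """Tier 1 ratio in percent = Tier 1 capital / risk-weighted assets * 100."""
    return tier1_capital / risk_weighted_assets * 100

def meets_minimum(ratio):
    """True when the ratio satisfies the Basel III Tier 1 minimum."""
    return ratio >= BASEL3_TIER1_MINIMUM

tier1 = Decimal("218000000000")   # from the example report
rwa = Decimal("1651515151515")    # hypothetical risk-weighted assets
ratio = tier1_ratio(tier1, rwa)
print(round(ratio, 1), meets_minimum(ratio))
```

`Decimal` is used instead of `float` so that monetary amounts are not subject to binary rounding.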

파생상품 분류 통합

# MiFID II 파생상품 분류
fibo:InterestRateDerivative a owl:Class ;
    rdfs:subClassOf fibo:Derivative ;
    mifid:assetClass mifid:RatesDerivative ;
    emir:assetClass emir:InterestRate ;
    doddFrank:swapCategory doddFrank:InterestRateSwap .

# 실제 거래
:Trade_IRS_20250110 a fibo:InterestRateDerivative ;
    fibo:hasUnderlying :USD_LIBOR_3M ;
    fibo:notionalAmount "10000000"^^xsd:decimal ;
    fibo:startDate "2025-01-10"^^xsd:date ;
    fibo:maturityDate "2030-01-10"^^xsd:date ;
    # 자동으로 3개 규제 모두 준수 확인
    fibo:compliesWith fibo:MiFID_II ;
    fibo:compliesWith fibo:EMIR ;
    fibo:compliesWith fibo:DoddFrank .

SPARQL 규제 준수 체크

# "Basel III Tier 1 비율이 6% 미만인 은행 찾기"
SELECT ?bank ?ratio ?amount
WHERE {
    ?report fibo:reportingEntity ?bank ;
            fibo:reportingPeriod "2024-Q4" .

    ?capital a fibo:BaselIII_Tier1Capital ;
             fibo:partOf ?report ;
             fibo:tier1Ratio ?ratio ;
             fibo:hasAmount/fibo:hasAmountValue ?amount .

    FILTER(?ratio < 6.0)
}

# 결과: (2024-Q4 기준 실제로는 모든 주요 은행이 준수)
# (이 쿼리가 결과를 반환하면 규제 위반!)

실제 효과 (Wells Fargo 사례, 2023)

도입 전 (2019):

규제 보고서 작성: 6주
수동 데이터 매핑: 50명 x 4주
오류율: 12%
연간 비용: $180M

도입 후 (2023):

규제 보고서 작성: 3일 (93% 감소)
자동 데이터 매핑: 94.5% 자동화
오류율: 0.8% (93% 감소)
연간 비용: $45M (75% 절감)

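ROI (return on investment):

Initial investment: $25M (2019-2020)
Annual savings: $135M
ROI: annual savings equal to 540% of the initial investment (about 1,520% net over 3 years)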
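
The percentages in this case study can be re-derived from the before and after figures. The short Python check below uses only the numbers quoted above; the helper name `pct_reduction` is made up for the example.

```python
def pct_reduction(before, after):
    """Percentage decrease from before to after."""
    return (before - after) / before * 100

initial_investment = 25        # $M
annual_saving = 180 - 45       # $M, from the before/after annual cost

print(round(pct_reduction(180, 45)))   # 75  (annual cost)
print(round(pct_reduction(12, 0.8)))   # 93  (error rate)
print(round(pct_reduction(42, 3)))     # 93  (6 weeks = 42 days, down to 3 days)

# One year of savings relative to the investment
print(annual_saving * 100 / initial_investment)                             # 540.0
# Net return over three years
print((3 * annual_saving - initial_investment) * 100 / initial_investment)  # 1520.0
```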

6. The Semantic Web Stack

Layered Structure: The Pyramid Model

The Semantic Web is a "stack" of technology layers. Each layer builds on the ones below it:

| Layer | Technologies | Description / examples |
|------|------|------|
| Layer 7: Trust & Proof | Digital signatures • Encryption • Chain of trust | e.g. X.509 certificates, PGP |
| Layer 6: Unifying Logic | Provable inference (Proof) • Logical consistency | e.g. first-order logic, description logic |
| Layer 5: Rules | RIF (Rule Interchange Format) • SWRL (Semantic Web Rule Language) • SHACL (Shapes Constraint Language) | e.g. "IF age >= 65 THEN senior" |
| Layer 4: Query | SPARQL (SPARQL Protocol and RDF Query Language) | e.g. SELECT, WHERE, FILTER |
| Layer 3: Ontology Vocabularies | OWL (Web Ontology Language) • RDFS (RDF Schema) | e.g. class, property, and relation definitions |
| Layer 2: Data Interchange | RDF (Resource Description Framework) • Triple: Subject-Predicate-Object | Serializations: Turtle, JSON-LD, RDF/XML |
| Layer 1: Identifiers & Syntax | URI/IRI (Uniform Resource Identifier) • Unicode • XML Namespaces | e.g. http://example.org/person/홍길동 |
| Layer 0: Character Set & Encoding | UTF-8 • ASCII | Basic character encoding |

Layer 1: URI/IRI (Identifiers)

Purpose: give everything a unique name

# URI: ASCII characters only
<http://example.org/person/JohnSmith>

# IRI: internationalized (Hangul, Japanese, Chinese, and so on)
<http://example.org/person/홍길동>
<http://example.org/会社/三星電子>

실제 사용:

@base <http://example.org/> .

# 절대 URI
<http://example.org/person/홍길동> a foaf:Person .

# 상대 URI (base 사용)
<person/김철수> a foaf:Person .

# QName (Qualified Name) - 가장 흔한 형식
@prefix ex: <http://example.org/> .
ex:홍길동 a foaf:Person .

Layer 2: RDF (Data Model)

Purpose: express information as triples (Subject-Predicate-Object)

# A basic triple
:홍길동 :나이 30 .

# Broken down:
# Subject: :홍길동 (the thing described)
# Predicate: :나이 (the property)
# Object: 30 (the value)
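
A triple maps naturally onto a three-field record, and a query is a pattern in which some fields are left open. The following Python sketch illustrates the idea with a hypothetical `Triple` type and `match` function; a real RDF library would add IRIs, literals with datatypes, and indexing.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None is a wildcard, like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.obj == o)]

graph = [Triple(":홍길동", ":나이", 30), Triple(":홍길동", ":이름", "홍길동")]
print(match(graph, p=":나이"))
```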

여러 직렬화 형식:

# Turtle (가장 읽기 쉬움)
@prefix ex: <http://example.org/> .
ex:홍길동 a ex:Person ;
    ex:이름 "홍길동"@ko ;
    ex:나이 30 .

<!-- RDF/XML (XML 친화적) -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/">
  <ex:Person rdf:about="http://example.org/홍길동">
    <ex:이름 xml:lang="ko">홍길동</ex:이름>
    <ex:나이 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">30</ex:나이>
  </ex:Person>
</rdf:RDF>

// JSON-LD (JSON 친화적)
{
  "@context": {
    "ex": "http://example.org/",
    "이름": "ex:이름",
    "나이": "ex:나이"
  },
  "@id": "ex:홍길동",
  "@type": "ex:Person",
  "이름": "홍길동",
  "나이": 30
}
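
All three serializations describe the same triples. As a rough illustration, the Python sketch below reads the JSON-LD document with the standard `json` module and expands the short names through `@context`. It handles only the flat context used in this example and is not a conforming JSON-LD processor; `expand` and `to_triples` are hypothetical helpers.

```python
import json

doc = json.loads('''{
  "@context": {"ex": "http://example.org/", "이름": "ex:이름", "나이": "ex:나이"},
  "@id": "ex:홍길동",
  "@type": "ex:Person",
  "이름": "홍길동",
  "나이": 30
}''')

def expand(term, context):
    """Resolve a term or prefix:local name to a full IRI using a flat @context."""
    term = context.get(term, term)
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term

def to_triples(doc):
    """Turn one flat JSON-LD node into (subject, predicate, object) tuples."""
    ctx = doc["@context"]
    subject = expand(doc["@id"], ctx)
    triples = [(subject, "rdf:type", expand(doc["@type"], ctx))]
    for key, value in doc.items():
        if not key.startswith("@"):
            triples.append((subject, expand(key, ctx), value))
    return triples

for t in to_triples(doc):
    print(t)
```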

Layer 3: RDFS/OWL (스키마 & 온톨로지)

RDFS: 기본 스키마

# 클래스 정의
:Person a rdfs:Class ;
    rdfs:label "사람"@ko ;
    rdfs:comment "인간 개체"@ko .

:Student a rdfs:Class ;
    rdfs:subClassOf :Person ;  # Student는 Person의 하위 클래스
    rdfs:label "학생"@ko .

# 속성 정의
:enrolledIn a rdf:Property ;
    rdfs:domain :Student ;  # 주어는 Student
    rdfs:range :University ;  # 목적어는 University
    rdfs:label "등록하다"@ko .

OWL: 고급 온톨로지

# OWL은 더 강력한 표현력
:Person a owl:Class ;
    owl:equivalentClass [
        a owl:Class ;
        owl:unionOf ( :Male :Female )
    ] .  # Person = Male OR Female

:hasParent a owl:ObjectProperty ;
    owl:inverseOf :hasChild .  # 역관계

:hasSpouse a owl:SymmetricProperty .  # A spouse B ⟹ B spouse A

:hasAncestor a owl:TransitiveProperty .  # A → B → C ⟹ A → C

Layer 4: SPARQL (쿼리)

목적: RDF 데이터를 검색

# 기본 SELECT 쿼리
SELECT ?name ?age
WHERE {
    ?person a :Person ;
            :이름 ?name ;
            :나이 ?age .
    FILTER(?age >= 20)
}
ORDER BY DESC(?age)
LIMIT 10

복잡한 쿼리:

# 홍길동의 모든 친구와 친구의 친구
SELECT ?friend ?friendOfFriend
WHERE {
    :홍길동 :친구 ?friend .
    ?friend :친구 ?friendOfFriend .
    FILTER(?friendOfFriend != :홍길동)  # 자기 자신 제외
}

Layer 5: Rules (규칙)

SWRL (Semantic Web Rule Language):

# Rule: a parent's parent is a grandparent
# hasParent(?x, ?y) ∧ hasParent(?y, ?z) → hasGrandparent(?x, ?z)

# In Turtle, rule variables are declared as swrl:Variable resources
# (Turtle itself has no ?x syntax):
:x a swrl:Variable .
:y a swrl:Variable .
:z a swrl:Variable .

[ a swrl:Imp ;
  swrl:body (
    [ a swrl:IndividualPropertyAtom ;
      swrl:propertyPredicate :hasParent ;
      swrl:argument1 :x ;
      swrl:argument2 :y ]
    [ a swrl:IndividualPropertyAtom ;
      swrl:propertyPredicate :hasParent ;
      swrl:argument1 :y ;
      swrl:argument2 :z ]
  ) ;
  swrl:head (
    [ a swrl:IndividualPropertyAtom ;
      swrl:propertyPredicate :hasGrandparent ;
      swrl:argument1 :x ;
      swrl:argument2 :z ]
  )
] .

SHACL (Shapes Constraint Language):

# 제약: 모든 Person은 반드시 이름을 가져야 함
:PersonShape a sh:NodeShape ;
    sh:targetClass :Person ;
    sh:property [
        sh:path :이름 ;
        sh:minCount 1 ;  # 최소 1개
        sh:maxCount 1 ;  # 최대 1개
        sh:datatype xsd:string ;
        sh:minLength 2 ;
        sh:maxLength 50
    ] .
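
What a SHACL validator checks for this shape can be spelled out by hand. The Python sketch below applies the same cardinality, datatype, and length rules to plain tuples; `validate_name` and its error messages are hypothetical, and a real validator produces a structured validation report instead.

```python
def validate_name(graph, person):
    """Check the :PersonShape rules for one node: exactly one name, 2 to 50 characters."""
    names = [o for s, p, o in graph if s == person and p == ":이름"]
    errors = []
    if len(names) < 1:
        errors.append("minCount 1 violated")
    if len(names) > 1:
        errors.append("maxCount 1 violated")
    for n in names:
        if not isinstance(n, str):
            errors.append("datatype xsd:string violated")
        elif not 2 <= len(n) <= 50:
            errors.append("length must be 2..50")
    return errors

graph = [(":p1", ":이름", "홍길동"), (":p2", ":이름", "김"), (":p3", ":나이", 30)]
for person in (":p1", ":p2", ":p3"):
    print(person, validate_name(graph, person))
```

Unlike OWL axioms, which add inferred facts under the open-world assumption, these checks reject data that is missing or malformed.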

Layer 6 & 7: Logic, Trust, Proof

논리 추론:

# 논리식: ∀x (Person(x) ∧ age(x) >= 65) → Senior(x)
:Person a owl:Class .
:Senior a owl:Class ;
    owl:equivalentClass [
        a owl:Class ;
        owl:intersectionOf (
            :Person
            [ a owl:Restriction ;
              owl:onProperty :age ;
              owl:someValuesFrom [
                  a rdfs:Datatype ;
                  owl:onDatatype xsd:integer ;
                  owl:withRestrictions ( [ xsd:minInclusive 65 ] )
              ]
            ]
        )
    ] .

신뢰 (Trust):

# 디지털 서명
:Statement_12345 a rdf:Statement ;
    rdf:subject :홍길동 ;
    rdf:predicate :나이 ;
    rdf:object 30 ;
    :signedBy :병원A ;
    :signature "A3F2B9E8..."^^xsd:hexBinary ;
    :timestamp "2025-01-15T10:30:00Z"^^xsd:dateTime .

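7. First Hands-On Exercise: Building a Family Ontology

Scenario

We will express the following family relationships as an ontology and let a reasoner derive further relationships automatically:

홍길동 (male, 55) ← married → 김영희 (female, 52)
  ↓
Children: 홍민수 (male, 25), 홍지영 (female, 23)

Parents of 김영희: 김철수 (male, 78), 이순자 (female, 75)

Step 1: Define the namespaces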

@prefix : <http://example.org/family#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .  # used by the SHACL shapes in Step 7

# Ontology metadata
: a owl:Ontology ;
    rdfs:label "가족 온톨로지"@ko ;
    rdfs:comment "가족 관계를 표현하는 온톨로지"@ko ;
    owl:versionInfo "1.0" ;
    rdfs:seeAlso <http://example.org/family/docs> .

Step 2: Define the classes

# Person
:Person a owl:Class ;
    rdfs:label "사람"@ko, "Person"@en ;
    rdfs:comment "인간 개체"@ko .

# Gender
:Male a owl:Class ;
    rdfs:subClassOf :Person ;
    rdfs:label "남성"@ko ;
    owl:disjointWith :Female .  # Male and Female are mutually exclusive

:Female a owl:Class ;
    rdfs:subClassOf :Person ;
    rdfs:label "여성"@ko ;
    owl:disjointWith :Male .

# Roles
:Parent a owl:Class ;
    rdfs:subClassOf :Person ;
    rdfs:label "부모"@ko .

:Father a owl:Class ;
    rdfs:subClassOf :Parent, :Male ;
    rdfs:label "아버지"@ko .

:Mother a owl:Class ;
    rdfs:subClassOf :Parent, :Female ;
    rdfs:label "어머니"@ko .

:Child a owl:Class ;
    rdfs:subClassOf :Person ;
    rdfs:label "자녀"@ko .

Step 3: Define the properties

# Basic properties
:name a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:range xsd:string ;
    rdfs:label "이름"@ko .

:age a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:range xsd:integer ;
    rdfs:label "나이"@ko .

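# :Male and :Female double as individuals here (OWL 2 punning)
:gender a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range [ a owl:Class ; owl:oneOf ( :Male :Female ) ] ;  # only :Male or :Female
    rdfs:label "성별"@ko .

# Relationship properties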
:hasParent a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :Parent ;
    rdfs:label "부모가 있다"@ko .

:hasChild a owl:ObjectProperty ;
    rdfs:domain :Parent ;
    rdfs:range :Child ;
    owl:inverseOf :hasParent ;  # inverse of hasParent
    rdfs:label "자녀가 있다"@ko .

:hasFather a owl:ObjectProperty ;
    rdfs:subPropertyOf :hasParent ;
    rdfs:range :Father ;
    rdfs:label "아버지가 있다"@ko .

:hasMother a owl:ObjectProperty ;
    rdfs:subPropertyOf :hasParent ;
    rdfs:range :Mother ;
    rdfs:label "어머니가 있다"@ko .

:hasSpouse a owl:ObjectProperty, owl:SymmetricProperty ;
    rdfs:domain :Person ;
    rdfs:range :Person ;
    rdfs:label "배우자가 있다"@ko .
    # Symmetric: A hasSpouse B ⟹ B hasSpouse A

:hasGrandparent a owl:ObjectProperty ;
    rdfs:label "조부모가 있다"@ko .

:hasGrandchild a owl:ObjectProperty ;
    owl:inverseOf :hasGrandparent ;
    rdfs:label "손자녀가 있다"@ko .

# Property chain
:hasGrandparent owl:propertyChainAxiom ( :hasParent :hasParent ) .
# Meaning: a parent's parent is a grandparent

Step 4: Define the individuals

# Names are plain xsd:string literals so that they match the rdfs:range of :name
# Grandparent generation
:김철수 a :Father ;
    :name "김철수" ;
    :age 78 ;
    :gender :Male .

:이순자 a :Mother ;
    :name "이순자" ;
    :age 75 ;
    :gender :Female .

:김철수 :hasSpouse :이순자 .

# Parent generation
:홍길동 a :Father ;
    :name "홍길동" ;
    :age 55 ;
    :gender :Male .

:김영희 a :Mother ;
    :name "김영희" ;
    :age 52 ;
    :gender :Female ;
    :hasFather :김철수 ;
    :hasMother :이순자 .

:홍길동 :hasSpouse :김영희 .

# Children's generation
:홍민수 a :Male, :Child ;
    :name "홍민수" ;
    :age 25 ;
    :gender :Male ;
    :hasFather :홍길동 ;
    :hasMother :김영희 .

:홍지영 a :Female, :Child ;
    :name "홍지영" ;
    :age 23 ;
    :gender :Female ;
    :hasFather :홍길동 ;
    :hasMother :김영희 .

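Step 5: Inference results

Running an OWL reasoner over this ontology (for example through Apache Jena or the OWL API) automatically infers the following. The property-chain results in items 3 and 4 require a reasoner that supports OWL 2.

# 1. Inverse-property inference (inverseOf)
# hasFather and hasMother are sub-properties of hasParent, and hasChild is the inverse of hasParent
:홍길동 :hasChild :홍민수 .
:홍길동 :hasChild :홍지영 .
:김영희 :hasChild :홍민수 .
:김영희 :hasChild :홍지영 .
:김철수 :hasChild :김영희 .
:이순자 :hasChild :김영희 .

# 2. Symmetric-property inference (SymmetricProperty)
:김영희 :hasSpouse :홍길동 .  # asserted: 홍길동 hasSpouse 김영희
:이순자 :hasSpouse :김철수 .

# 3. Property-chain inference (propertyChainAxiom)
:홍민수 :hasGrandparent :김철수 .  # 홍민수 → 김영희 → 김철수
:홍민수 :hasGrandparent :이순자 .
:홍지영 :hasGrandparent :김철수 .
:홍지영 :hasGrandparent :이순자 .

# 4. The inverse grandchild relation is inferred as well
:김철수 :hasGrandchild :홍민수 .
:김철수 :hasGrandchild :홍지영 .
:이순자 :hasGrandchild :홍민수 .
:이순자 :hasGrandchild :홍지영 .

# 5. Class-membership inference
:홍길동 a :Parent .  # asserted as Father, and Father ⊆ Parent
:김영희 a :Parent .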
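To see where these triples come from, the sketch below reimplements just four of the axioms from Step 3 (sub-property, inverse, symmetric, and property chain) as a naive forward-chaining loop in plain Python. It is an illustration under simplifying assumptions, not a substitute for an OWL reasoner: class membership, domain and range, and the reverse direction of owl:inverseOf are deliberately left out.

```python
# Asserted relationship triples from Step 4 (prefixes omitted)
ASSERTED = {
    ("김영희", "hasFather", "김철수"), ("김영희", "hasMother", "이순자"),
    ("홍민수", "hasFather", "홍길동"), ("홍민수", "hasMother", "김영희"),
    ("홍지영", "hasFather", "홍길동"), ("홍지영", "hasMother", "김영희"),
    ("김철수", "hasSpouse", "이순자"), ("홍길동", "hasSpouse", "김영희"),
}

# The axioms from Step 3, written as lookup tables
SUB_PROPERTY = {"hasFather": "hasParent", "hasMother": "hasParent"}
INVERSE = {"hasParent": "hasChild", "hasGrandparent": "hasGrandchild"}
SYMMETRIC = {"hasSpouse"}
CHAINS = {("hasParent", "hasParent"): "hasGrandparent"}


def infer(asserted):
    """Apply the axioms repeatedly until no new triple appears (a fixpoint)."""
    facts = set(asserted)
    while True:
        new = set()
        for s, p, o in facts:
            if p in SUB_PROPERTY:
                new.add((s, SUB_PROPERTY[p], o))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in SYMMETRIC:
                new.add((o, p, s))
        for (first, second), result in CHAINS.items():
            for s, p1, middle in facts:
                if p1 != first:
                    continue
                for s2, p2, o in facts:
                    if p2 == second and s2 == middle:
                        new.add((s, result, o))
        if new <= facts:  # nothing new: fixpoint reached
            return facts
        facts |= new


closure = infer(ASSERTED)
inferred = closure - ASSERTED
for triple in sorted(inferred):
    print(triple)
```

The loop needs several passes because some inferences depend on others: hasChild can only be derived after hasFather has been lifted to hasParent, and hasGrandchild only after the property chain has produced hasGrandparent.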

Step 6: Verify with SPARQL queries

# Run these queries against the inferred graph.
# Each query needs this prefix declaration when run on its own.
PREFIX : <http://example.org/family#>

# Query 1: find all of 홍민수's grandparents
SELECT ?grandparent ?name ?age
WHERE {
    :홍민수 :hasGrandparent ?grandparent .
    ?grandparent :name ?name ;
                 :age ?age .
}

# Result:
# 김철수 | "김철수" | 78
# 이순자 | "이순자" | 75

# Query 2: find all of 김철수's grandchildren
SELECT ?grandchild ?name ?age
WHERE {
    :김철수 :hasGrandchild ?grandchild .
    ?grandchild :name ?name ;
                :age ?age .
}
ORDER BY ?age

# Result:
# 홍지영 | "홍지영" | 23
# 홍민수 | "홍민수" | 25

# Query 3: find married couples
SELECT ?person1 ?name1 ?person2 ?name2
WHERE {
    ?person1 :hasSpouse ?person2 ;
             :name ?name1 .
    ?person2 :name ?name2 .
    FILTER(STR(?person1) < STR(?person2))  # keep each couple only once
}

# Result (the FILTER keeps the row in which ?person1 sorts first):
# 김철수 | "김철수" | 이순자 | "이순자"
# 김영희 | "김영희" | 홍길동 | "홍길동"
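The FILTER in Query 3 deserves a closer look. Because hasSpouse is symmetric, the inferred graph contains every couple twice, once in each direction, and comparing the two IRIs as strings keeps only the direction in which the first IRI sorts lower. The Python sketch below reproduces that comparison, assuming plain code-point string ordering and the example namespace from Step 1.

```python
BASE = "http://example.org/family#"

# Both directions of every couple, as present in the inferred graph
spouse_pairs = {
    ("홍길동", "김영희"), ("김영희", "홍길동"),
    ("김철수", "이순자"), ("이순자", "김철수"),
}

# Same idea as FILTER(STR(?person1) < STR(?person2))
couples = sorted(
    (a, b) for a, b in spouse_pairs
    if BASE + a < BASE + b
)
print(couples)
```

Each couple appears exactly once, and the person whose IRI sorts lower always lands in the first column, regardless of which direction was originally asserted.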

Step 7: Add constraints (SHACL)

# Every Person must have exactly one name and exactly one age
:PersonShape a sh:NodeShape ;
    sh:targetClass :Person ;
    sh:property [
        sh:path :name ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:message "모든 사람은 정확히 하나의 이름을 가져야 합니다."@ko
    ] ;
    sh:property [
        sh:path :age ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:integer ;
        sh:minInclusive 0 ;
        sh:maxInclusive 150 ;
        sh:message "나이는 0-150 사이여야 합니다."@ko
    ] .

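# Every Child must have at least one parent.
# Validate the inferred graph: the data asserts hasFather/hasMother, and SHACL does not apply rdfs:subPropertyOf on its own.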
:ChildShape a sh:NodeShape ;
    sh:targetClass :Child ;
    sh:property [
        sh:path :hasParent ;
        sh:minCount 1 ;  # at least one parent required
        sh:message "자녀는 최소 1명의 부모가 있어야 합니다."@ko
    ] .

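Summary

What you learned in this chapter

Origins of ontology
- Aristotle's 10 categories (384-322 BC)
- 2,300 years of thinking applied to modern computer science
Tom Gruber's definition (1993)
- "A specification of a conceptualization"
- Shared, agreed upon, formally expressed
The Semantic Web vision
- Tim Berners-Lee's vision (1994, 2001)
- Google Knowledge Graph: 50 billion+ entities
- Schema.org: 10 million+ websites
- Linked Open Data: 15 billion+ triples
Databases vs. ontologies
- Closed World vs. Open World
- Fixed schema vs. flexible schema
- No inference vs. automatic inference
Solving real problems
- Healthcare: SNOMED CT (Mayo Clinic, $70M saved)
- E-commerce: product matching (Amazon, 23% higher conversion rate)
- Finance: FIBO (Wells Fargo, $135M saved per year)
The Semantic Web stack
- URI/IRI → RDF → RDFS/OWL → SPARQL → Rules → Trust
- Each layer builds on the layers below it
Hands-on: the family ontology
- Defining classes, properties, and individuals
- Automatic inference (inverse, symmetric, property chain)
- Verification with SPARQL queries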

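Key takeaways

An ontology is more than a database.

A standard for representing and sharing knowledge
A structure that machines can interpret and reason over
A bridge between otherwise separate systems

The Semantic Web is already here.

1 billion+ Google Search queries
Structured data from Wikipedia (DBpedia)
Every industry, including healthcare, finance, and e-commerce

Next chapter preview

Chapter 2 takes a close look at **RDF (Resource Description Framework)**:

The structure of a triple
Turtle, JSON-LD, and RDF/XML syntax
Blank nodes and named graphs
Hands-on RDF data creation

Next chapter: Chapter 2: RDF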