π‘π‘œπ‘‘π‘’π‘π‘œπ‘œπ‘˜

2.SQL λ³Έλ¬Έ

SQL

2.SQL

seoa__ 2025. 1. 8. 20:55

  [ λͺ©μ°¨ ]

    DBMS

    :   DB에 μ ‘κ·Όν•˜κ³  관리할 수 μžˆλŠ” μ†Œν”„νŠΈμ›¨μ–΄

     

    DBMS ❓

    : λ°μ΄ν„°μ˜ μ €μž₯, 검색, μ—…λ°μ΄νŠΈ, μ‚­μ œλ₯Ό 효율적으둜 μˆ˜ν–‰ν•  수 μžˆλ„λ‘ 지원

     

    μ£Όμš” DBMS와 νŠΉμ§•

     

    κ΄€κ³„ν˜•μ΄ κ°€μž₯ 널리 μ‚¬μš©λ˜λŠ” 방식
    κ΄€κ³„ν˜• λ°μ΄ν„°λ² μ΄μŠ€λŠ” 데이터λ₯Ό "ν…Œμ΄λΈ”"μ΄λΌλŠ” λ‹¨μœ„λ‘œ 관리

     

    μ½”λ”©μˆœμ„œ

    SELECT → FROM → JOIN → ON → WHERE → GROUP BY → HAVING → ORDER BY → LIMIT

     

    μ‹€ν–‰μˆœμ„œ

    FROM → ON → JOIN → WHERE → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT

    πŸ“Œselectμ ˆμ—μ„œ μ„ μ–Έν•œ 별칭을 having μ ˆμ—μ„œ μ‚¬μš©ν•  수 μžˆλŠ”κ°€?πŸ“Œ
    A : Yes !!!!!!! whereμ ˆμ—μ„  No !!!!!!!

    SELECTμ ˆμ—μ„œ μ„ μ–Έν•΄μ€€ Aliasλ₯Ό ORDER BYλŠ” 물둠이고 GROUP BY, HAVINGμ—μ„œλ„ 이용

     

    μ—΄ (Column) ↓

    열은 ν…Œμ΄λΈ”μ— μ €μž₯된 λ°μ΄ν„°μ˜ 각 ν•­λͺ©(νŠΉμ„±)을 λ‚˜νƒ€λƒ„
    즉, 열은 λ°μ΄ν„°μ˜ 속성을 μ •μ˜ν•˜λ©°, 같은 μ—΄μ—λŠ” λ™μΌν•œ μœ ν˜•μ˜ 데이터가 μ €μž₯됨

     

    ν–‰(Row) →

    행은 ν…Œμ΄λΈ”μ—μ„œ κ°œλ³„ 데이터 ν•­λͺ©(λ ˆμ½”λ“œ)을 λ‚˜νƒ€λƒ„
    κ³ μœ ν•œ 데이터λ₯Ό λ‹΄κ³  있으며, 각 μ—΄μ˜ 데이터λ₯Ό μ‘°ν•©ν•˜μ—¬ ν•˜λ‚˜μ˜ μ™„μ „ν•œ 정보λ₯Ό λ‚˜νƒ€λƒ„

     

    데이터 νƒ€μž…

    λ¬Έμžν˜• : varchar (문자, 숫자, 특수문자 포함)
    μˆ˜μΉ˜ν˜• : int, float (μ •μˆ˜,μ‹€μˆ˜)
    λ‚ μ§œ/μ‹œκ°„ν˜• : date (yyyy-mm-dd)
    λΆˆλ¦¬μ–Έ : boolean (true/fales ν˜Ήμ€ 0/1)

     

    join = vlook up

    union : ν•©μΉ˜κ³  μ •λ ¬, μ€‘λ³΅μ œκ±° = μ„±λŠ₯이 λ‹€μ†Œ λ–¨μ–΄μ§ˆ μˆ˜λ„ μžˆμŒ (μ‹œκ°„ μ˜€λž˜ κ±Έλ¦Ό)
    union all : ν•©μΉ˜κΈ°λ§Œ ν•¨



    λ°˜ν™˜ 
    1. κ°’
        2. ν–‰
        3. ν…Œμ΄λΈ”
        
    의쑴 
    1. μƒκ΄€ : μ™ΈλΆ€μ»€λ¦¬λž‘ κ³„속 μ£Όκ³  λ°›μŒ 
    2. λΉ„상관 : λ”± ν•œ λ²ˆλ§Œ μ‹€ν–‰ (독립)
        

    μ„œλΈŒμΏΌλ¦¬λŠ” μ™ΈλΆ€ μΏΌλ¦¬λ³΄λ‹€ λ¨Όμ € μ‹€ν–‰λ˜λ©°, λ°˜ν™˜λœ κ²°κ³ΌλŠ” μ™ΈλΆ€ μΏΌλ¦¬μ—μ„œ μ‚¬μš©

     

    null

    null = 값이 μ—†λ‹€
    0 μ΄λž‘ λ‹€λ₯Έ κ°œλ…μ΄λ‹€
    null 은 'is null', 'is not null' μ‚¬μš©

     

    -- 일반적인 비ꡐ연산은 NULLμ—μ„œ μž‘λ™ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€.
    SELECT *
    FROM ν…Œμ΄λΈ”λͺ…
    WHERE 컬럼λͺ… = NULL; -- μž‘λ™ν•˜μ§€ μ•ŠμŒ!
    
    -- NULL κ°’λ§Œ 선택
    SELECT * 
    FROM ν…Œμ΄λΈ”λͺ…
    WHERE 컬럼λͺ… IS NULL;
    
    -- NULL이 μ•„λ‹Œ κ°’λ§Œ 선택
    SELECT * 
    FROM ν…Œμ΄λΈ”λͺ…
    WHERE 컬럼λͺ… IS NOT NULL;

     

    select : 쑰회
    from : ν…Œμ΄λΈ” μ§€μ •
    distinct : 쀑볡 제거

    SELECT DISTINCT city 
    FROM employees;

     

    limit : 갯수 μ œν•œ

    where : 쑰건 μ§€μ •
    between A and B : A 와 B 사이 쑰회

    -- λ‚˜μ΄κ°€ 30~50 사이인 직원 쑰회
    SELECT * 
    FROM employees 
    WHERE age BETWEEN 30 AND 50;

     

    in : μΌμΉ˜ν•˜λŠ” 데이터 필터링

    -- IN μ˜ˆμ‹œ: νŠΉμ • λ„μ‹œμ—μ„œ κ·Όλ¬΄ν•˜λŠ” 직원 쑰회
    SELECT * 
    FROM employees 
    WHERE city IN ('Seoul', 'Busan');

     

    not in : μ œμ™Έν•œ 데이터 쑰회

    -- NOT IN μ˜ˆμ‹œ: νŠΉμ • λ„μ‹œλ₯Ό μ œμ™Έν•œ 직원 쑰회
    SELECT * 
    FROM employees 
    WHERE city NOT IN ('Seoul', 'Busan');

     

    like

    νŠΉμ • νŒ¨ν„΄κ³Ό μΌμΉ˜ν•˜λŠ” 데이터 필터링 %, _ ν™œμš© (not like도 κ°€λŠ₯)
    μ‘°κ±΄λ¬Έμ—μ„œ μ‚¬μš©ν•  수 μžˆλŠ” μ—°μ‚°μž : <>, !=

     

    case when

    else end as

    : 첫번째둜 참인 쑰건이 μ‹€ν–‰ → 쑰건을 λ§Œμ‘±ν•˜μ§€ μ•ŠμœΌλ©΄ else 절이 μ‹€ν–‰

    SELECT name, age,
           CASE 
               WHEN age < 30 THEN 'μ²­λ…„'
               WHEN age BETWEEN 30 AND 50 THEN '쀑년'
               ELSE '노인'
           END AS age_group
    FROM employees;

    μ§‘κ³„ν•¨μˆ˜

    : 단일 결과값을 λ°˜ν™˜ν•˜λŠ” ν•¨μˆ˜
    λŒ€λΆ€λΆ„μ˜ 집계 ν•¨μˆ˜λŠ” null κ°’ μ œμ™Έν•˜κ³  μž‘λ™ν•¨
    μ˜ˆμ™Έμ μœΌλ‘œ count(*) λŠ” null포함 ‼️‼️‼️

    count : λ°μ΄ν„°μ˜ ν–‰ 수 λ°˜ν™˜
    count(distinct) : κ³ μœ ν•œ 값을 κ°–λŠ” λ°μ΄ν„°μ˜ ν–‰ 수 λ°˜ν™˜
    sum : μ—΄μ˜ 값을 λͺ¨λ‘ 더함
    avg : μ—΄ κ°’μ˜ 평균 계산
    min : μ΅œμ†Œκ°’
    max : μ΅œλŒ€κ°’

    group by : κ·Έλ£Ήν™”ν•˜μ—¬ μš”μ•½ 정보 생성
    having : group by둜 κ·Έλ£Ήν™”λœ 데이터에 쑰건을 μΆ”κ°€λ‘œ μ μš©ν•  λ•Œ μ‚¬μš©

    • where + group by
    -- IT λΆ€μ„œκ°€ μ•„λ‹Œ μ§μ›λ“€λ§Œ μ„ νƒν•˜μ—¬ λΆ€μ„œλ³„ 총 κΈ‰μ—¬λ₯Ό 계산
    SELECT department_id, SUM(salary) AS total_salary
    FROM employees
    WHERE department_id != 'IT'
    GROUP BY department_id;
    
    --WHERE department_id != 'IT'둜 데이터λ₯Ό ν•„ν„°λ§ν•©λ‹ˆλ‹€.
    --ν•„ν„°λ§λœ 데이터λ₯Ό GROUP BY department_id둜 κ·Έλ£Ήν™”ν•œ ν›„, 총 κΈ‰μ—¬λ₯Ό 계산
    • group by + having
    -- 예제 μ½”λ“œ1 
    -- λΆ€μ„œλ³„ 총 κΈ‰μ—¬κ°€ 10,000 이상인 λΆ€μ„œλ§Œ 선택
    SELECT department_id, SUM(salary) AS total_salary
    FROM employees
    GROUP BY department_id
    HAVING SUM(salary) >= 10000;
    -- λͺ¨λ“  데이터λ₯Ό GROUP BY department_id둜 κ·Έλ£Ήν™”ν•©λ‹ˆλ‹€.
    -- HAVING 쑰건을 μ μš©ν•˜μ—¬ 총 κΈ‰μ—¬κ°€ 10,000 이상인 λΆ€μ„œλ§Œ 선택!
    -- SELECT μ ˆμ—μ„œ 두 컬럼 쑰회 

    where + group by + having

    -- 예제 μ½”λ“œ1
    -- λ‚˜μ΄κ°€ 30 이상인 μ§μ›λ§Œ μ„ νƒν•˜κ³ (WHERE), 성별별 κ·Έλ£Ήν™”κ°€ 된 ν›„(GROUP BY)
    -- 평균 λ‚˜μ΄κ°€ 35 이상인 μ„±λ³„λ§Œ 쑰회(HAVING)
    SELECT gender,
           AVG(age) AS average_age
    FROM employees
    WHERE age >= 30
    GROUP BY gender
    HAVING AVG(age) >= 35;
    -- 예제 μ½”λ“œ2
    -- IT λΆ€μ„œμ˜ μ§μ›λ§Œ μ„ νƒν•˜κ³ , λΆ€μ„œλ³„ 총 κΈ‰μ—¬κ°€ 10,000 이상인 경우만 ν‘œμ‹œ
    SELECT department_id, SUM(salary) AS total_salary
    FROM employees
    WHERE department_id IN ('IT', 'HR')
    GROUP BY department_id
    HAVING SUM(salary) >= 10000;
    
    --WHERE department_id IN ('IT', 'HR')둜 IT와 HR λΆ€μ„œμ˜ μ§μ›λ§Œ 필터링
    --ν•„ν„°λ§λœ 데이터λ₯Ό GROUP BY department_id둜 κ·Έλ£Ήν™”
    --κ·Έλ£Ήν™”λœ κ²°κ³Όμ—μ„œ SUM(salary) >= 10000 쑰건을 λ§Œμ‘±ν•˜λŠ” λΆ€μ„œλ§Œ  쑰회

     

    πŸ“Œwhere 와 having μ°¨μ΄μ πŸ“Œ
    where

    • κ·Έλ£Ήν™” 이전에 데이터 필터링
    • κ°œλ³„ 행을 κΈ°μ€€μœΌλ‘œ 쑰건 적용
      having
      -κ·Έλ£Ήν™” 이후에 데이터λ₯Ό 필터링, 즉 κ·Έλ£Ήν™”λœ κ²°κ³Όλ₯Ό κΈ°μ€€μœΌλ‘œ 쑰건을 적용
      차이 μš”μ•½λ°μ΄ν„° → where (κ°œλ³„ ν–‰ 필터링) → group by (κ·Έλ£Ήν™”) → having (κ·Έλ£Ήν™”λœ κ²°κ³Ό 필터링)

    order by

    : μ •λ ¬
    asc(μƒλž΅κ°€λŠ₯)

    : μ˜€λ¦„μ°¨μˆœ
    desc

    : λ‚΄λ¦Όμ°¨μˆœ

     

    μœˆλ„μš° ν•¨μˆ˜

    : over() 절과 ν•¨κ»˜ μ‚¬μš©, 행을 κ·Έλ£Ήν™”, μ •λ ¬, λ²”μœ„ μ„€μ •ν•˜μ—¬ λ‹€μ–‘ν•œ ν•¨μˆ˜λ“€μ„ 제곡

     

    μœˆλ„μš° ν•¨μˆ˜ κΈ°λ³Έ 문법
    select μœˆλ„μš°ν•¨μˆ˜(컬럼1) over (
    [partition by 컬럼2] -- κ·Έλ£Ήν™”
        [order by 컬럼3 asc/desc] --μ •λ ¬
        [rows/range between a and b] --계산 λ²”μœ„
        ) as κ²°κ³Ό
        from ν…Œμ΄λΈ”;



    μœˆλ„μš° ν•¨μˆ˜μ˜ μ§„ν–‰ μˆœμ„œ

     

    1. λ°μ΄ν„°λ₯Ό νŒŒν‹°μ…˜μœΌλ‘œ λ‚˜λˆ” (κ·Έλ£Ήν™”)
    2. κ° κ·Έλ£Ήλ³„ λ°μ΄ν„° μ •λ ¬
    3. μ •λ ¬λœ λ°μ΄ν„° λ‚΄μ—μ„œ μ—°μ‚°λ²”μœ„ μ„€μ •
    4. μ§€μ •λœ λ²”μœ„μ™€ μ •렬에 λ”°λΌ μœˆλ„μš°ν•¨μˆ˜ κ°’ κ³„μ‚°
                           

    μœˆλ„μš° ν•¨μˆ˜ νŠΉμ§•

     

    집계 ν•¨μˆ˜μ˜ ν™•μž₯ : sum,avg λ“±μ˜ μ§‘계 κ²°κ³Όλ₯Ό κ°œλ³„ ν–‰μ— ν‘œμ‹œ
    1.κΈ°μ‘΄ μ§‘계 ν•¨μˆ˜λŠ” κ·Έλ£Ή λ‹¨μœ„λ‘œ ν•¨μˆ˜κ°€ μ μš©λ˜μ§€λ§Œ, μœˆλ„μš° ν•¨μˆ˜λŠ” **~~각 ν–‰μ˜ λ°μ΄ν„°μ™€ ν•¨κ»˜~~** μ§‘계 κ²°κ³Ό μœ μ§€
    2. μ›λ³Έ λ°μ΄ν„° μœ μ§€ : μ›λ³Έ λ°μ΄ν„°μ™€ ν•¨κ»˜ μœˆλ„μš° ν•¨μˆ˜μ˜ κ²°κ³Όλ₯Ό λ™μ‹œμ— ν™•인 κ°€λŠ₯
    3. λ‹€μ–‘ν•œ κΈ°λŠ₯의 ν•¨μˆ˜ μ œκ³΅ : μˆœμœ„ λ§€κΈ°κΈ°, λˆ„적합 κ³„μ‚°, νŠΉμ • ν–‰ κ°’ μ°Έμ‘°, λ°±λΆ„μœ„ κ³„μ‚° λ“±λ“±

     

    μœˆλ„μš° ν•¨μˆ˜κ°€ μΆ”κ°€λœλ‹€λ©΄ ?
    from → on → join → where → group by → having → μœˆλ„μš° → select → distinct→ order by → limit

     


    μ’…λ₯˜ 

    μˆœμœ„ ν•¨μˆ˜ → rank(), dense_rank(), row nember()
    집계 ν•¨μˆ˜ → sum() -κ·Έλ£Ήν™”λœ 데이터(νŒŒν‹°μ…˜ λ‚΄)에 λŒ€ν•΄ λˆ„μ ν•© 계산 (β–³μƒκ°ν•˜κΈ°!)
    ν–‰ μ°Έμ‘° ν•¨μˆ˜ → laf(),lead()
    λΉ„μœ¨ ν•¨μˆ˜ : percent_rank()
          
    μœˆλ„μš° ν•¨μˆ˜ λ²”μœ„ μ§€μ • 
    raws : λ°μ΄ν„° ν–‰ μˆœμ„œ κΈ°μ€€ (μ•žλ’€ ν•œ ν–‰μ”©)
    range : λ°μ΄ν„° κ°’ κΈ°μ€€ (κ°’ +, - μΈ λͺ¨λ“  ν–‰)

     

    λ²”μœ„ μ˜΅μ…˜

    'SQL' μΉ΄ν…Œκ³ λ¦¬μ˜ λ‹€λ₯Έ κΈ€

    1. 데이터 뢄석에 λŒ€ν•˜μ—¬  (1) 2025.01.08