[pandas] Data Extraction Basics

뱅타 2023. 3. 27. 10:48

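These are my notes on the pandas features covered in the week 4 lecture. (I wrote them up so that I would get used to them faster.)

Since I am not used to pandas yet, I found it easy to get confused unless I study it separately.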

head

df.head(5)

	account	name	street	city	state	postal-code	Jan	Feb	Mar
0	211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
1	320563	Walter-Trantow	1311 Alvis Tunnel	Port Khadijah	NorthCarolina	38365	95000	45000	35000

...
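
The post does not show how `df` was loaded. As a minimal sketch, assuming the frame is rebuilt by hand from the three rows that are visible in the outputs of this post (the real lecture data set is larger), `head` can be tried like this:

```python
import pandas as pd

# Hand-built stand-in for the lecture data (assumption: only the three
# rows that appear in this post are included).
df = pd.DataFrame(
    {
        "account": [211829, 320563, 648336],
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
        "street": [
            "34456 Sean Highway",
            "1311 Alvis Tunnel",
            "62184 Schamberger Underpass Apt. 231",
        ],
        "city": ["New Jaycob", "Port Khadijah", "New Lilianland"],
        "state": ["Texas", "NorthCarolina", "Iowa"],
        "postal-code": [28752, 38365, 76517],
        "Jan": [10000, 95000, 91000],
        "Feb": [62000, 45000, 120000],
        "Mar": [35000, 35000, 35000],
    }
)

first_two = df.head(2)  # first n rows
default_head = df.head()  # n defaults to 5; a shorter frame is returned whole
print(first_two)
```

With only three rows available, `df.head()` simply returns all of them.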

Transpose

# Print with rows and columns swapped
df.head(3).T

	0	1	2
account	211829	320563	648336
name	Kerluke, Koepp and Hilpert	Walter-Trantow	Bashirian, Kunde and Price
street	34456 Sean Highway	1311 Alvis Tunnel	62184 Schamberger Underpass Apt. 231

...
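
To see what `.T` actually does to the labels, here is a small sketch on a hand-built two-column frame (an assumption for illustration, using values from this post): the column names become the index and the row labels become the columns.

```python
import pandas as pd

# Small stand-in frame (assumption: two columns are enough to show the swap).
df = pd.DataFrame(
    {
        "account": [211829, 320563, 648336],
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
    }
)

transposed = df.T  # rows become columns and columns become rows

print(df.shape)          # 3 rows, 2 columns
print(transposed.shape)  # 2 rows, 3 columns
print(list(transposed.index))    # former column names
print(list(transposed.columns))  # former row labels
```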

slicing

# Slice out three rows (0, 1, 2) and print them.
# This gives the same result as df.head(3).
df[:3]

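	account	name	street	city	state	postal-code	Jan	Feb	Mar
0	211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
1	320563	Walter-Trantow	1311 Alvis Tunnel	Port Khadijah	NorthCarolina	38365	95000	45000	35000
2	648336	Bashirian, Kunde and Price	62184 Schamberger Underpass Apt. 231	New Lilianland	Iowa	76517	91000	120000	35000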

df[["name","street"]][:2]

	name	street
0	Kerluke, Koepp and Hilpert	34456 Sean Highway
1	Walter-Trantow	1311 Alvis Tunnel
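
What confused me here is that square brackets mean different things depending on what goes inside: a slice picks rows by position, while a list of labels picks columns. A minimal sketch on a hand-built frame (an assumption for illustration, using values from this post):

```python
import pandas as pd

# Small stand-in frame (assumption: three columns are enough here).
df = pd.DataFrame(
    {
        "account": [211829, 320563, 648336],
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
        "street": [
            "34456 Sean Highway",
            "1311 Alvis Tunnel",
            "62184 Schamberger Underpass Apt. 231",
        ],
    }
)

rows = df[:2]                      # slice: first two rows
cols = df[["name", "street"]]      # list of labels: two columns, all rows
both = df[["name", "street"]][:2]  # columns first, then rows

same_as_head = df[:2].equals(df.head(2))
print(both)
```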

index

df.index = df["account"]
df

	account	name	street	city	state	postal-code	Jan	Feb	Mar
account
211829	211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
320563	320563	Walter-Trantow	1311 Alvis Tunnel	Port Khadijah	NorthCarolina	38365	95000	45000	35000

delete

del df["account"]
df.head()

	name	street	city	state	postal-code	Jan	Feb	Mar
account
211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
320563	Walter-Trantow	1311 Alvis Tunnel	Port Khadijah	NorthCarolina	38365	95000	45000	35000
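
The two steps above (assign the column as the index, then delete the column) can probably be collapsed into a single `set_index` call. This is a sketch on a hand-built frame, not something the lecture showed; `make_df` is a hypothetical helper used only so that both variants start from the same data.

```python
import pandas as pd


def make_df() -> pd.DataFrame:
    """Hypothetical helper: build a small stand-in frame from rows in this post."""
    return pd.DataFrame(
        {
            "account": [211829, 320563, 648336],
            "name": [
                "Kerluke, Koepp and Hilpert",
                "Walter-Trantow",
                "Bashirian, Kunde and Price",
            ],
            "street": [
                "34456 Sean Highway",
                "1311 Alvis Tunnel",
                "62184 Schamberger Underpass Apt. 231",
            ],
        }
    )


# Variant 1: the two steps used in this post.
two_step = make_df()
two_step.index = two_step["account"]
del two_step["account"]

# Variant 2: set_index moves the column into the index in one call
# (the column is removed from the frame by default).
one_step = make_df().set_index("account")

# Raises an AssertionError if the two frames differ.
pd.testing.assert_frame_equal(two_step, one_step)
```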

location

df.loc[[211829,320563],["name","street"]]

	name	street
account
211829	Kerluke, Koepp and Hilpert	34456 Sean Highway
320563	Walter-Trantow	1311 Alvis Tunnel

df.loc[205217:,["name","street"]]

	name	street
account
205217	Kovacek-Johnston	91971 Cronin Vista Suite 601
209744	Champlin-Morar	26739 Grant Lock

...
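
`loc` works with labels, so the numbers here are account values, not positions. One detail worth noting is that a label slice includes its end label. A minimal sketch on a hand-built frame indexed by `account` (an assumption for illustration, using values from this post):

```python
import pandas as pd

# Small stand-in frame, already indexed by account number.
df = pd.DataFrame(
    {
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
        "street": [
            "34456 Sean Highway",
            "1311 Alvis Tunnel",
            "62184 Schamberger Underpass Apt. 231",
        ],
    },
    index=pd.Index([211829, 320563, 648336], name="account"),
)

picked = df.loc[[211829, 648336], ["name"]]     # explicit row and column labels
tail = df.loc[320563:, ["name", "street"]]      # from label 320563 to the end
span = df.loc[211829:320563]                    # both end labels are included

print(tail)
```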

iloc

# iloc: integer-location (position-based) indexing
# [row, col]
df.iloc[:10, :3]

	name	street	city
account
211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob
320563	Walter-Trantow	1311 Alvis Tunnel	Port Khadijah
648336	Bashirian, Kunde and Price	62184 Schamberger Underpass Apt. 231	New Lilianland
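
To keep `loc` and `iloc` apart, it helped me to select the same cells both ways. The sketch below uses a hand-built frame indexed by `account` (an assumption for illustration): `iloc` counts positions and excludes the end of a slice, while `loc` uses labels and includes it.

```python
import pandas as pd

# Small stand-in frame, already indexed by account number.
df = pd.DataFrame(
    {
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
        "street": [
            "34456 Sean Highway",
            "1311 Alvis Tunnel",
            "62184 Schamberger Underpass Apt. 231",
        ],
        "city": ["New Jaycob", "Port Khadijah", "New Lilianland"],
    },
    index=pd.Index([211829, 320563, 648336], name="account"),
)

by_position = df.iloc[:2, :2]             # rows 0-1, columns 0-1 (end excluded)
by_label = df.loc[:320563, :"street"]     # up to and including these labels

first_name = df.iloc[0]["name"]           # first row, regardless of its label
print(by_position.equals(by_label))
```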

reset

df_new = df.reset_index()
df_new

	account	name	street	city	state	postal-code	Jan	Feb	Mar
0	211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
1	320563	Walter-Trantow	1311 Alvis Tunnel	Port Khadijah	NorthCarolina	38365	95000	45000	35000
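
As far as I can tell, `reset_index` returns a new frame and turns the old index back into a regular column. If the old index is not needed, `drop=True` discards it instead. A minimal sketch on a hand-built frame (an assumption for illustration):

```python
import pandas as pd

# Small stand-in frame, already indexed by account number.
df = pd.DataFrame(
    {
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
        "street": [
            "34456 Sean Highway",
            "1311 Alvis Tunnel",
            "62184 Schamberger Underpass Apt. 231",
        ],
    },
    index=pd.Index([211829, 320563, 648336], name="account"),
)

restored = df.reset_index()             # "account" becomes a column again
discarded = df.reset_index(drop=True)   # old index is thrown away

print(list(restored.columns))
print(list(discarded.columns))
```

The original `df` keeps its `account` index in both cases.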

drop

# Does not modify the original object; drop returns a new DataFrame.
df_new.drop(1).head()

	account	name	street	city	state	postal-code	Jan	Feb	Mar
0	211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
2	648336	Bashirian, Kunde and Price	62184 Schamberger Underpass Apt. 231	New Lilianland	Iowa	76517	91000	120000	35000

inplace

# To keep the result of drop, do one of the following:

# 1. Store the result in a variable
df_drop = df_new.drop(1)
df_drop

# 2. Use inplace=True
df_new.drop(1, inplace=True)
df_new

	account	name	street	city	state	postal-code	Jan	Feb	Mar
0	211829	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
2	648336	Bashirian, Kunde and Price	62184 Schamberger Underpass Apt. 231	New Lilianland	Iowa	76517	91000	120000	35000
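
The difference between the two options is easiest to see by checking the original frame after each call. A minimal sketch on a hand-built frame (an assumption for illustration):

```python
import pandas as pd

# Small stand-in frame with the default 0, 1, 2 index.
df = pd.DataFrame(
    {
        "account": [211829, 320563, 648336],
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
    }
)

# Option 1: drop returns a new frame, the original is untouched.
df_drop = df.drop(1)
rows_before = len(df)  # still 3

# Option 2: inplace=True changes df itself.
df.drop(1, inplace=True)
rows_after = len(df)   # now 2

print(rows_before, rows_after)
```

Storing the result in a variable keeps the original around, which seems safer while experimenting in a notebook.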

axis

# axis=0: rows
# axis=1: columns

# Drop the "account" column
df_new.drop("account", inplace=True, axis=1)
df_new

	name	street	city	state	postal-code	Jan	Feb	Mar
0	Kerluke, Koepp and Hilpert	34456 Sean Highway	New Jaycob	Texas	28752	10000	62000	35000
2	Bashirian, Kunde and Price	62184 Schamberger Underpass Apt. 231	New Lilianland	Iowa	76517	91000	120000	35000

# Drop the rows with index 0 and 2
df_new.drop([0, 2], axis=0)

	name	street	city	state	postal-code	Jan	Feb	Mar
3	D'Amore, Gleichner and Bode	155 Fadel Crescent Apt. 144	Hyattburgh	Maine	46021	45000	120000	10000
4	Bauch-Goldner	7274 Marissa Common	Shanahanchester	California	49681	162000	120000	35000
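
Since I keep mixing up `axis=0` and `axis=1`, it may be easier to use the `index=` and `columns=` keywords of `drop`, which say directly what is being removed. A minimal sketch on a hand-built frame (an assumption for illustration), checking that both spellings give the same result:

```python
import pandas as pd

# Small stand-in frame with the default 0, 1, 2 index.
df = pd.DataFrame(
    {
        "account": [211829, 320563, 648336],
        "name": [
            "Kerluke, Koepp and Hilpert",
            "Walter-Trantow",
            "Bashirian, Kunde and Price",
        ],
        "Jan": [10000, 95000, 91000],
    }
)

# Dropping a column: axis=1 versus the columns= keyword.
cols_by_axis = df.drop("account", axis=1)
cols_by_keyword = df.drop(columns="account")

# Dropping rows: axis=0 versus the index= keyword.
rows_by_axis = df.drop([0, 2], axis=0)
rows_by_keyword = df.drop(index=[0, 2])

print(cols_by_axis.equals(cols_by_keyword))
print(rows_by_axis.equals(rows_by_keyword))
```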

References

https://hogni.tistory.com/49
