👩‍💻 | Taking on Motion Generation from Scratch: Autoencoder Edition
The paper I used as a reference when building the model in this post is listed below.
Data Acquisition & Preprocessing
All of the data in this paper comes from the CMU mocap data. The overall preprocessing process is:
- Sub-sample from 120 fps (the CMU default) to 30 fps
- Cut the data into 160-frame windows that overlap by 80 frames
- Normalize joint lengths
- Reduce the joints to the 20 most important ones
- Transform global positions into local positions
- local position = Global position - Root XZ position
- Remove the Y rotation as well
- Compute the angular velocity about the Y axis relative to the character's forward direction
- After the 60 values (= 20*3), append the root XZ velocity and the Y angular velocity, for a total of 63 values
The resulting data therefore has the shape (number of windows, 160, 63).
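The sub-sampling, root-relative transform, and windowing steps from the list above can be sketched in NumPy. This is only a minimal illustration, not the paper's or pymo's code: the function names, the convention that joint 0 is the root, and the synthetic clip are all assumptions, and the three velocity channels are left out for brevity.

```python
import numpy as np


def subsample(frames: np.ndarray, src_fps: int = 120, tgt_fps: int = 30) -> np.ndarray:
    """Keep every (src_fps // tgt_fps)-th frame, e.g. every 4th for 120 -> 30 fps."""
    return frames[:: src_fps // tgt_fps]


def to_local(positions: np.ndarray) -> np.ndarray:
    """local position = global position - root XZ position.

    positions has shape (frames, joints, 3); joint 0 is assumed to be the root.
    """
    root_xz = positions[:, :1, :].copy()  # (frames, 1, 3)
    root_xz[..., 1] = 0.0                 # leave the Y (height) component untouched
    return positions - root_xz


def sliding_windows(frames: np.ndarray, window: int = 160, overlap: int = 80) -> np.ndarray:
    """Cut a clip into fixed-size windows that overlap by `overlap` frames."""
    stride = window - overlap
    starts = range(0, len(frames) - window + 1, stride)
    if len(starts) == 0:  # clip shorter than one window
        return np.empty((0, window) + frames.shape[1:])
    return np.stack([frames[s : s + window] for s in starts])


# Synthetic 16-second clip at 120 fps with 20 joints (illustrative data only).
clip = np.random.default_rng(0).normal(size=(1920, 20, 3))
local = to_local(subsample(clip))            # (480, 20, 3)
features = local.reshape(len(local), -1)     # (480, 60): 20 joints * 3 coordinates
windows = sliding_windows(features)
print(windows.shape)  # (5, 160, 60)
```

With the root XZ velocity and Y angular velocity appended, the last dimension would grow from 60 to 63.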
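Because bone lengths differ from clip to clip, the data went through a normalization step, and this part can be obtained from the author's site.
So the third step in the process above (joint length normalization) could be skipped.
For preprocessing, I used the pymo library.
Unfortunately, the original has been left unmaintained for a very long time, so there are many forked versions in use. I built my own preprocessing pipeline on top of the simon version below. The preprocessing could then be done as follows.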
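```python
# Import paths assume the usual pymo package layout.
from sklearn.pipeline import Pipeline
from pymo.parsers import BVHParser
from pymo.preprocessing import (
    DownSampler, JointSelector, MocapParameterizer, Numpyfier, Slicer,
)

# JOINTS (the 20 selected joint names), filepath, and the custom
# GlobalMotionRemover transformer are defined elsewhere in the project.
parser = BVHParser()
parsed_data = parser.parse(filepath)

data_pipe = Pipeline(
    [
        ("param", MocapParameterizer("position")),  # joint angles -> positions
        ("jtsel", JointSelector(JOINTS, include_root=False)),
        ("dwnsampl", DownSampler(tgt_fps=30, keep_all=False)),  # 120 fps -> 30 fps
        ("globrm", GlobalMotionRemover()),  # Custom template
        ("np", Numpyfier()),
    ]
)
piped_data = data_pipe.fit_transform([parsed_data])

# Cut each clip into 160-frame windows with 50% (80-frame) overlap.
slicer = Slicer(window_size=160, overlap=0.5)
piped_data = slicer.fit_transform(piped_data)
```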
The final data has the shape (9721, 160, 63).
Convolutional Autoencoder
I built the model with Keras, and its summary is shown below. The focus here is on learning a motion manifold rather than on generating entirely new motion. Both the input and the output are windows of about 5 seconds (160 frames at 30 fps).
Layer (type) | Output Shape | Param # |
---|---|---|
input_layer (InputLayer) | (None, 160, 63) | 0 |
conv1d (Conv1D) | (None, 160, 64) | 60,544 |
max_pooling1d (MaxPooling1D) | (None, 80, 64) | 0 |
activation (Activation) | (None, 80, 64) | 0 |
conv1d_1 (Conv1D) | (None, 80, 128) | 123,008 |
max_pooling1d_1 (MaxPooling1D) | (None, 40, 128) | 0 |
activation_1 (Activation) | (None, 40, 128) | 0 |
conv1d_2 (Conv1D) | (None, 40, 256) | 491,776 |
max_pooling1d_2 (MaxPooling1D) | (None, 20, 256) | 0 |
activation_2 (Activation) | (None, 20, 256) | 0 |
functional_1 (Functional) | (None, 160, 63) | 675,135 |

Total params: 1,350,463 (5.15 MB)
Trainable params: 1,350,463 (5.15 MB)
Non-trainable params: 0 (0.00 B)
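As a sanity check on the summary, the parameter counts can be reproduced by hand. A 1D convolution with bias has `kernel_size * in_channels * out_channels + out_channels` parameters. The kernel size of 15 below is not stated in the post; it is an assumption inferred from the counts. The mirrored three-layer decoder is likewise only a guess that happens to match the `functional_1` total.

```python
def conv1d_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights plus biases of a 1D convolution layer."""
    return kernel_size * in_channels * out_channels + out_channels


KERNEL_SIZE = 15                 # assumption: inferred from the parameter counts
channels = [63, 64, 128, 256]    # feature sizes taken from the summary

# Encoder: 63 -> 64 -> 128 -> 256
encoder = [
    conv1d_params(c_in, c_out, KERNEL_SIZE)
    for c_in, c_out in zip(channels, channels[1:])
]

# Hypothetical mirrored decoder: 256 -> 128 -> 64 -> 63
reversed_channels = channels[::-1]
decoder = [
    conv1d_params(c_in, c_out, KERNEL_SIZE)
    for c_in, c_out in zip(reversed_channels, reversed_channels[1:])
]

print(encoder)                       # [60544, 123008, 491776]
print(sum(decoder))                  # 675135
print(sum(encoder) + sum(decoder))   # 1350463
```

The time axis shrinks from 160 to 80 to 40 to 20 because each pooling layer halves it, which matches the output shapes in the table.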
Train
When I trained with some of the hyperparameters given in the paper, training made no progress and the loss blew up.
For the learning rate decay, I changed the initial value from 0.5 to 0.001 and the decay value from 0.9 to 0.95. I also applied z-score normalization to the data before training. Apart from that, I tried to keep the implementation as close to the paper as possible. On a GeForce RTX 5060, training took about 6 minutes and fitting took 3 minutes. Compared with the 5 hours of training time reported in the paper, it really shows how far hardware has come over the past ten years or so.
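The two changes above, z-score normalization and a decaying learning rate, can be sketched as follows. This is a rough illustration under stated assumptions rather than the code used for the post. It assumes statistics are computed per feature over all windows and frames, and that the decay value multiplies the learning rate once per epoch. The function names are hypothetical.

```python
import numpy as np


def zscore_fit(x: np.ndarray):
    """Per-feature mean and std over windows and frames.

    x has shape (windows, frames, features), e.g. (9721, 160, 63).
    """
    mean = x.mean(axis=(0, 1), keepdims=True)
    std = x.std(axis=(0, 1), keepdims=True)
    std = np.where(std < 1e-8, 1.0, std)  # avoid dividing by zero for constant features
    return mean, std


def exponential_decay(initial_lr: float, decay: float, epoch: int) -> float:
    """Learning rate after `epoch` epochs (assumed per-epoch multiplicative decay)."""
    return initial_lr * decay**epoch


# Illustrative random data standing in for the real windows.
x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(10, 160, 63))
mean, std = zscore_fit(x)
x_norm = (x - mean) / std  # keep mean and std to de-normalize the model output later

# Values from the post: initial learning rate 0.001, decay 0.95.
schedule = [exponential_decay(0.001, 0.95, epoch) for epoch in range(3)]
```

Keeping `mean` and `std` around matters because the autoencoder's output lives in normalized space and has to be mapped back before it can be viewed as motion.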
Learning rate | Total Loss |
---|---|
![]() | ![]() |
Result
With the manifold learned, the results below show that the model performs well on the given data.
Sampled Index 1 | Sampled Index 2 |
---|---|
![]() | ![]() |
Code
I cleaned up the code as much as I could and left it in the repository below.
- Authors
- Name
- Amelia Young
- GitHub
- @ameliacode