Monte Carlo Portfolio Optimization in Python

What Is Monte Carlo Simulation?

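Monte Carlo simulation is a statistical technique that estimates the outcome of a complex system by repeated random sampling. In quantitative investing, it is used to generate tens of thousands of random portfolio combinations, compute the return, volatility, and Sharpe ratio of each one, and search among them for the best asset allocation.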

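It is one of the most intuitive ways to approximate numerically the efficient frontier from Harry Markowitz's Modern Portfolio Theory (MPT), and it stays flexible under complex constraints where mathematical optimization is difficult. Hedge funds and asset managers also use Monte Carlo methods widely for portfolio risk analysis and stress testing.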

Core Concepts: The Efficient Frontier and the Sharpe Ratio

The following concepts are the ones you need to understand Monte Carlo portfolio optimization.

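Efficient frontier: the set of portfolios that achieve the highest return for a given level of risk, or the lowest risk for a given return
Sharpe ratio = (portfolio return − risk-free rate) / portfolio volatility
Minimum variance portfolio (MVP): the portfolio with the lowest volatility
Maximum Sharpe portfolio: the portfolio with the highest return per unit of risk
Covariance matrix: a matrix describing how asset returns move together; it is what drives the diversification benefit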
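
To make these definitions concrete, here is a small worked example for a two-asset portfolio. The expected returns, volatilities, correlation, and risk-free rate are made-up illustrative numbers, not estimates from market data.

```python
import numpy as np

# Hypothetical annualized inputs for two assets (illustrative numbers only).
mean_returns = np.array([0.10, 0.04])   # expected returns
vols = np.array([0.20, 0.10])           # volatilities
corr = -0.2                             # correlation between the two assets
risk_free = 0.02                        # assumed risk-free rate

# Covariance matrix built from the volatilities and the correlation.
off_diag = corr * vols[0] * vols[1]
cov = np.array([
    [vols[0] ** 2, off_diag],
    [off_diag, vols[1] ** 2],
])

weights = np.array([0.5, 0.5])

port_return = weights @ mean_returns           # weighted average return
port_vol = np.sqrt(weights @ cov @ weights)    # sqrt(w' Σ w)
sharpe = (port_return - risk_free) / port_vol

print(f"return {port_return:.4f}, volatility {port_vol:.4f}, Sharpe {sharpe:.2f}")
print(f"weighted-average volatility: {weights @ vols:.4f}")
```

The portfolio volatility comes out below the weighted average of the two individual volatilities, which is the diversification effect captured by the covariance matrix.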

Python Environment Setup and Data Collection

pip install yfinance pandas numpy matplotlib scipy

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

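# Download price data for multiple assets
tickers = ['SPY', 'QQQ', 'TLT', 'GLD', 'VNQ', 'EFA']
names = ['US Large Cap', 'Nasdaq 100', 'US Long Treasuries', 'Gold', 'REITs', 'Developed Markets']

# yfinance may not preserve the requested ticker order; reindex so the columns follow `tickers`
data = yf.download(tickers, start="2018-01-01", end="2025-12-31")['Close'][tickers]
returns = data.pct_change().dropna()

# Basic statistics
annual_returns = returns.mean() * 252
annual_vol = returns.std() * np.sqrt(252)
risk_free = 0.04  # annual risk-free rate, same value as the functions' default

print("=== Annualized Statistics by Asset ===")
for i, t in enumerate(tickers):
    sharpe = (annual_returns[t] - risk_free) / annual_vol[t]
    print(f"{names[i]:>18} ({t}): return {annual_returns[t]*100:>6.1f}%, "
          f"volatility {annual_vol[t]*100:>6.1f}%, "
          f"Sharpe {sharpe:.2f}")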

Core Implementation of the Monte Carlo Simulation

We generate tens of thousands of random portfolios and explore the efficient frontier visually.

def monte_carlo_portfolios(returns, num_portfolios=50000, risk_free=0.04):
    """
    Generate random portfolios by Monte Carlo simulation
    - num_portfolios: number of portfolios to generate
    - risk_free: annual risk-free rate
    """
    num_assets = len(returns.columns)
    mean_returns = returns.mean() * 252
    cov_matrix = returns.cov() * 252

    # Storage for results: return, volatility, Sharpe, then one column per asset weight
    results = np.zeros((num_portfolios, 3 + num_assets))

    for i in range(num_portfolios):
        # Random weights, normalized to sum to 1
        weights = np.random.random(num_assets)
        weights /= weights.sum()

        # Portfolio return and volatility
        port_return = np.dot(weights, mean_returns)
        port_vol = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
        sharpe = (port_return - risk_free) / port_vol

        results[i, 0] = port_return
        results[i, 1] = port_vol
        results[i, 2] = sharpe
        results[i, 3:] = weights

    columns = ['return', 'volatility', 'sharpe'] + list(returns.columns)
    df_results = pd.DataFrame(results, columns=columns)

    return df_results


# Run the simulation
portfolios = monte_carlo_portfolios(returns, num_portfolios=50000)

# Find the best portfolios
max_sharpe_idx = portfolios['sharpe'].idxmax()
min_vol_idx = portfolios['volatility'].idxmin()

max_sharpe = portfolios.loc[max_sharpe_idx]
min_vol = portfolios.loc[min_vol_idx]

print("=== Maximum Sharpe Portfolio ===")
print(f"Return: {max_sharpe['return']*100:.1f}%")
print(f"Volatility: {max_sharpe['volatility']*100:.1f}%")
print(f"Sharpe: {max_sharpe['sharpe']:.2f}")
for t in returns.columns:
    print(f"  {t}: {max_sharpe[t]*100:.1f}%")

print("\n=== Minimum Volatility Portfolio ===")
print(f"Return: {min_vol['return']*100:.1f}%")
print(f"Volatility: {min_vol['volatility']*100:.1f}%")
for t in returns.columns:
    print(f"  {t}: {min_vol[t]*100:.1f}%")
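
Normalizing independent uniform draws, as the loop above does, tends to produce weights clustered around an equal-weight mix, so concentrated portfolios near the edges of the frontier are rarely sampled. One possible alternative is to draw weights from a flat Dirichlet distribution, which is uniform over all long-only weight vectors, and to vectorize the calculation. The sketch below is an illustration under those assumptions; `sample_portfolios` is a hypothetical helper and the inputs are synthetic.

```python
import numpy as np

def sample_portfolios(mean_returns, cov_matrix, num_portfolios=50_000,
                      risk_free=0.04, seed=0):
    """Vectorized random long-only portfolios (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = len(mean_returns)
    # Dirichlet(1, ..., 1) is uniform over the simplex {w >= 0, sum(w) = 1}.
    weights = rng.dirichlet(np.ones(n), size=num_portfolios)
    port_returns = weights @ mean_returns
    # Row-wise quadratic form w' Σ w, computed without a Python loop.
    port_vols = np.sqrt(np.einsum('ij,jk,ik->i', weights, cov_matrix, weights))
    sharpes = (port_returns - risk_free) / port_vols
    return weights, port_returns, port_vols, sharpes

# Synthetic annualized inputs, for illustration only.
mu = np.array([0.08, 0.10, 0.03])
cov = np.array([[0.04, 0.03, 0.00],
                [0.03, 0.09, 0.00],
                [0.00, 0.00, 0.01]])

w, r, v, s = sample_portfolios(mu, cov, num_portfolios=10_000)
best = s.argmax()
print("best weights:", w[best].round(3), "Sharpe:", round(s[best], 2))
```

With real data, the same function could be called with `returns.mean().values * 252` and `returns.cov().values * 252`. Fixing the seed makes runs reproducible, which helps when comparing results across sample sizes.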

Visualizing the Efficient Frontier

plt.figure(figsize=(14, 8))

# Scatter plot of all portfolios (colored by Sharpe ratio)
scatter = plt.scatter(
    portfolios['volatility'] * 100,
    portfolios['return'] * 100,
    c=portfolios['sharpe'],
    cmap='viridis', alpha=0.3, s=5
)
plt.colorbar(scatter, label='Sharpe Ratio')

# Maximum Sharpe portfolio
plt.scatter(max_sharpe['volatility'] * 100, max_sharpe['return'] * 100,
           marker='*', color='red', s=500, zorder=5, label='Max Sharpe')

# Minimum volatility portfolio
plt.scatter(min_vol['volatility'] * 100, min_vol['return'] * 100,
           marker='*', color='blue', s=500, zorder=5, label='Min Volatility')

# Mark the individual assets
annual_returns = returns.mean() * 252
annual_vol = returns.std() * np.sqrt(252)
for t in returns.columns:
    plt.scatter(annual_vol[t] * 100, annual_returns[t] * 100,
               marker='D', s=100, zorder=5)
    plt.annotate(t, (annual_vol[t] * 100 + 0.3, annual_returns[t] * 100))

plt.title('Monte Carlo Efficient Frontier (50,000 Portfolios)', fontsize=14)
plt.xlabel('Annual Volatility (%)')
plt.ylabel('Annual Return (%)')
plt.legend(fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('efficient_frontier.png', dpi=150)
plt.show()

Mathematical Optimization: Computing the Optimal Portfolio Precisely with scipy

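Monte Carlo only gives an approximation. With scipy.optimize, a numerical solver can locate the optimal portfolio much more precisely, up to the solver's convergence tolerance.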

def optimize_portfolio(returns, risk_free=0.04):
    """Compute the optimal portfolios with scipy"""
    num_assets = len(returns.columns)
    mean_returns = returns.mean() * 252
    cov_matrix = returns.cov() * 252

    def neg_sharpe(weights):
        port_return = np.dot(weights, mean_returns)
        port_vol = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
        return -(port_return - risk_free) / port_vol

    def min_variance(weights):
        # Returns volatility (square root of variance); both have the same minimizer
        return np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))

    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bounds = tuple((0, 1) for _ in range(num_assets))
    init = np.array([1/num_assets] * num_assets)

    # Maximum Sharpe portfolio
    opt_sharpe = minimize(neg_sharpe, init, method='SLSQP',
                          bounds=bounds, constraints=constraints)

    # Minimum variance portfolio
    opt_minvol = minimize(min_variance, init, method='SLSQP',
                          bounds=bounds, constraints=constraints)

    return {
        'max_sharpe': {
            'weights': opt_sharpe.x,
            'return': np.dot(opt_sharpe.x, mean_returns),
            'volatility': np.sqrt(np.dot(opt_sharpe.x.T,
                                  np.dot(cov_matrix, opt_sharpe.x))),
            'sharpe': -opt_sharpe.fun
        },
        'min_vol': {
            'weights': opt_minvol.x,
            'return': np.dot(opt_minvol.x, mean_returns),
            'volatility': opt_minvol.fun,
        }
    }


opt = optimize_portfolio(returns)

print("=== Mathematical Optimization Results ===")
print(f"\nMaximum Sharpe (Sharpe={opt['max_sharpe']['sharpe']:.2f}):")
for i, t in enumerate(returns.columns):
    w = opt['max_sharpe']['weights'][i]
    if w > 0.01:
        print(f"  {t}: {w*100:.1f}%")

print(f"\nMinimum Volatility (Vol={opt['min_vol']['volatility']*100:.1f}%):")
for i, t in enumerate(returns.columns):
    w = opt['min_vol']['weights'][i]
    if w > 0.01:
        print(f"  {t}: {w*100:.1f}%")
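
SLSQP is an iterative solver, so it is worth confirming that it actually converged before trusting the weights. The sketch below checks the `success` flag on the returned result and compares the optimized Sharpe ratio with the best value found by random sampling. It uses small synthetic inputs (`mu`, `cov`, and `rf` are illustrative assumptions) so that it runs without downloading data.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic annualized inputs, for illustration only.
mu = np.array([0.08, 0.10, 0.03])
cov = np.array([[0.04, 0.03, 0.00],
                [0.03, 0.09, 0.00],
                [0.00, 0.00, 0.01]])
rf = 0.02
n = len(mu)

def neg_sharpe(w):
    return -(w @ mu - rf) / np.sqrt(w @ cov @ w)

res = minimize(neg_sharpe, np.full(n, 1 / n), method='SLSQP',
               bounds=[(0, 1)] * n,
               constraints=[{'type': 'eq', 'fun': lambda w: w.sum() - 1}])

# Stop early if the solver reports a failure instead of silently using res.x.
if not res.success:
    raise RuntimeError(f"optimization failed: {res.message}")

# Cross-check against the best of 20,000 random long-only portfolios.
rng = np.random.default_rng(0)
w_rand = rng.dirichlet(np.ones(n), size=20_000)
sharpe_rand = (w_rand @ mu - rf) / np.sqrt(
    np.einsum('ij,jk,ik->i', w_rand, cov, w_rand))

print(f"SLSQP Sharpe:       {-res.fun:.4f}")
print(f"Best random Sharpe: {sharpe_rand.max():.4f}")
```

If the random search ever beats the optimizer by a clear margin, that suggests the solver stopped at a poor point, and a different starting vector or tighter tolerance may be needed.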

Risk Simulation: VaR and Worst-Case Scenario Analysis

We use Monte Carlo simulation to estimate how much the optimal portfolio could lose under extreme market conditions.

def risk_simulation(returns, weights, num_simulations=10000,
                    holding_days=252, initial=100000):
    """
    Monte Carlo risk simulation
    - Computes VaR (Value at Risk) and CVaR
    - Analyzes worst-case and best-case scenarios
    """
    mean_returns = returns.mean().values
    cov_matrix = returns.cov().values

    # Simulate daily asset returns; shape is (simulations, days, assets)
    port_returns = np.random.multivariate_normal(
        mean_returns, cov_matrix,
        (num_simulations, holding_days)
    )

    # Weighted portfolio returns
    weighted_returns = port_returns @ weights

    # Cumulative portfolio value
    cum_returns = initial * np.cumprod(1 + weighted_returns, axis=1)
    final_values = cum_returns[:, -1]

    # VaR: portfolio values at the 5th and 1st percentiles (loss = initial - value)
    var_95 = np.percentile(final_values, 5)
    var_99 = np.percentile(final_values, 1)
    cvar_95 = final_values[final_values <= var_95].mean()

    print(f"=== Risk Simulation ({num_simulations:,} runs) ===")
    print(f"Initial investment: ${initial:,.0f}")
    print(f"Holding period: {holding_days} days")
    print(f"\nExpected value: ${final_values.mean():,.0f}")
    print(f"Median: ${np.median(final_values):,.0f}")
    print(f"Best case (95th pct): ${np.percentile(final_values, 95):,.0f}")
    print(f"Worst case (5th pct): ${var_95:,.0f}")
    print(f"\nVaR 95%: ${initial - var_95:,.0f} loss")
    print(f"VaR 99%: ${initial - var_99:,.0f} loss")
    print(f"CVaR 95%: ${initial - cvar_95:,.0f} average loss")

    return final_values, cum_returns


# Risk analysis of the optimal portfolio
weights = opt['max_sharpe']['weights']
final_vals, paths = risk_simulation(returns, weights)
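
To isolate how VaR and CVaR relate to each other, the following sketch computes both from an array of simulated final values. `var_cvar` is a hypothetical helper, and the final values are synthetic lognormal draws rather than output from `risk_simulation`.

```python
import numpy as np

def var_cvar(final_values, initial, level=0.95):
    """Loss-based VaR and CVaR from simulated final values (hypothetical helper)."""
    final_values = np.asarray(final_values, dtype=float)
    # Portfolio value at the lower tail, e.g. the 5th percentile for level=0.95.
    cutoff = np.percentile(final_values, 100 * (1 - level))
    var = initial - cutoff
    # CVaR: average loss across the scenarios at or below the cutoff.
    cvar = initial - final_values[final_values <= cutoff].mean()
    return var, cvar

initial = 100_000
rng = np.random.default_rng(42)
# Synthetic final values, for illustration only.
final_values = initial * np.exp(rng.normal(0.05, 0.15, size=100_000))

var95, cvar95 = var_cvar(final_values, initial, 0.95)
var99, cvar99 = var_cvar(final_values, initial, 0.99)
print(f"VaR 95%: {var95:,.0f}  CVaR 95%: {cvar95:,.0f}")
print(f"VaR 99%: {var99:,.0f}  CVaR 99%: {cvar99:,.0f}")
```

CVaR is never smaller than VaR at the same confidence level, because it averages only the outcomes that are at least as bad as the VaR cutoff.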

Visualizing the Risk Simulation

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Simulated paths (a sample of 100)
for i in range(100):
    axes[0].plot(paths[i], alpha=0.1, color='steelblue', linewidth=0.5)
axes[0].plot(np.median(paths, axis=0), color='red', linewidth=2, label='Median')
axes[0].plot(np.percentile(paths, 5, axis=0), color='orange',
            linewidth=2, linestyle='--', label='5th Percentile')
axes[0].axhline(y=100000, color='gray', linestyle='--', alpha=0.5)
axes[0].set_title('Portfolio Value Simulation Paths', fontsize=12)
axes[0].set_xlabel('Trading Days')
axes[0].set_ylabel('Portfolio Value ($)')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Distribution of final portfolio values
axes[1].hist(final_vals, bins=100, color='steelblue', edgecolor='white', alpha=0.7)
axes[1].axvline(x=100000, color='gray', linestyle='--', label='Initial')
axes[1].axvline(x=np.percentile(final_vals, 5), color='red',
               linestyle='--', label='VaR 95%')
axes[1].axvline(x=np.mean(final_vals), color='green',
               linestyle='--', label='Mean')
axes[1].set_title('Final Portfolio Value Distribution', fontsize=12)
axes[1].set_xlabel('Portfolio Value ($)')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('risk_simulation.png', dpi=150)
plt.show()

Portfolio Optimization with Constraints

In practice, you need to go beyond a simple long-only setup and reflect a variety of constraints.

def constrained_optimization(returns, risk_free=0.04,
                              max_weight=0.40,
                              min_weight=0.05,
                              sector_limits=None):
    """
    Portfolio optimization with constraints
    - max_weight: maximum weight of any single asset
    - min_weight: minimum weight of any single asset
    - sector_limits: list of (asset indices, upper limit) pairs
    """
    num_assets = len(returns.columns)
    mean_returns = returns.mean() * 252
    cov_matrix = returns.cov() * 252

    def neg_sharpe(weights):
        port_return = np.dot(weights, mean_returns)
        port_vol = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
        return -(port_return - risk_free) / port_vol

    # Constraints
    constraints = [
        {'type': 'eq', 'fun': lambda x: np.sum(x) - 1}  # weights sum to 1
    ]

    # Sector limits (e.g., total equity exposure of at most 60%)
    if sector_limits:
        for indices, limit in sector_limits:
            constraints.append({
                'type': 'ineq',
                'fun': lambda x, idx=indices, lim=limit: lim - sum(x[i] for i in idx)
            })

    # Allowed weight range for each asset
    bounds = tuple((min_weight, max_weight) for _ in range(num_assets))
    init = np.array([1/num_assets] * num_assets)

    result = minimize(neg_sharpe, init, method='SLSQP',
                      bounds=bounds, constraints=constraints)

    weights = result.x
    port_return = np.dot(weights, mean_returns)
    port_vol = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))

    print("=== Constrained Optimization Results ===")
    print(f"Return: {port_return*100:.1f}%, Volatility: {port_vol*100:.1f}%")
    print(f"Sharpe: {(port_return - risk_free)/port_vol:.2f}")
    for i, t in enumerate(returns.columns):
        print(f"  {t}: {weights[i]*100:.1f}%")

    return weights


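# Apply the constraints
# Equity assets (SPY, QQQ, EFA) capped at 60% in total.
# Look up positions by name, since the column order of `returns` may differ from `tickers`.
equity_idx = [returns.columns.get_loc(t) for t in ['SPY', 'QQQ', 'EFA']]
sector_limits = [(equity_idx, 0.60)]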

constrained_weights = constrained_optimization(
    returns,
    max_weight=0.35,
    min_weight=0.05,
    sector_limits=sector_limits
)
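
Weight bounds and sector caps can easily contradict each other, in which case the solver has no valid answer to return. A quick pre-check like the one sketched here catches the obvious cases before calling `minimize`. `check_weight_bounds` is a hypothetical helper; it tests necessary conditions only and does not prove feasibility when several sector limits overlap.

```python
def check_weight_bounds(num_assets, min_weight, max_weight, sector_limits=None):
    """Return a list of detected problems; an empty list means none were found."""
    eps = 1e-12
    problems = []
    if min_weight > max_weight:
        problems.append("min_weight exceeds max_weight")
    if num_assets * min_weight > 1 + eps:
        problems.append("minimum weights alone sum to more than 100%")
    if num_assets * max_weight < 1 - eps:
        problems.append("maximum weights cannot reach 100%")
    for indices, limit in sector_limits or []:
        k = len(indices)
        if k * min_weight > limit + eps:
            problems.append(f"sector {list(indices)}: minimum weights exceed the {limit:.0%} cap")
        # Assets outside the sector must be able to absorb the remainder.
        reachable = min(limit, k * max_weight) + (num_assets - k) * max_weight
        if reachable < 1 - eps:
            problems.append(f"sector {list(indices)}: cap leaves total below 100%")
    return problems

# The settings used in this article: 6 assets, 5% to 35% each, equities capped at 60%.
print(check_weight_bounds(6, 0.05, 0.35, [([0, 1, 5], 0.60)]))
# A contradictory setup: six assets capped at 15% each can hold at most 90%.
print(check_weight_bounds(6, 0.05, 0.15))
```

Checking `result.success` after the optimization remains the final safeguard, since this pre-check cannot cover every combination of constraints.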

Walk-Forward Validation: Guarding Against Overfitting

We use walk-forward validation to check whether a result optimized on the full historical dataset still holds up in the future.

def walk_forward_test(returns, train_months=24, test_months=6):
    """
    Walk-forward validation
    - train_months: length of the optimization window (months)
    - test_months: length of the evaluation window (months)
    """
    results = []
    train_days = train_months * 21
    test_days = test_months * 21

    # +1 so that a final window ending exactly at the last row is included
    for start in range(0, len(returns) - train_days - test_days + 1, test_days):
        train = returns.iloc[start:start + train_days]
        test = returns.iloc[start + train_days:start + train_days + test_days]

        if len(test) < test_days * 0.8:
            break

        # Optimize on the training window
        opt = optimize_portfolio(train)
        weights = opt['max_sharpe']['weights']

        # Evaluate performance on the test window
        test_returns = test @ weights
        cum_return = (1 + test_returns).prod() - 1
        vol = test_returns.std() * np.sqrt(252)

        # Benchmark (equal weight)
        equal_returns = test.mean(axis=1)
        bench_cum = (1 + equal_returns).prod() - 1

        results.append({
            'period': f"{train.index[-1].date()}",
            'strategy_return': cum_return * 100,
            'benchmark_return': bench_cum * 100,
            'volatility': vol * 100,
            'outperform': cum_return > bench_cum
        })

    results_df = pd.DataFrame(results)
    win_rate = results_df['outperform'].mean() * 100
    avg_excess = (results_df['strategy_return'] - results_df['benchmark_return']).mean()

    print("=== Walk-Forward Validation Results ===")
    print(f"Number of test periods: {len(results_df)}")
    print(f"Share of periods beating the benchmark: {win_rate:.0f}%")
    print(f"Average excess return: {avg_excess:.1f} pp")
    print(results_df.to_string(index=False))

    return results_df


wf_results = walk_forward_test(returns, train_months=24, test_months=6)

Key Considerations for Real-World Use

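Number of simulations: Run at least 10,000 simulations to get stable results. The more assets you have, the more simulations you need.
Limits of the normality assumption: The risk simulation above draws returns from a multivariate normal distribution, but real returns have fat tails. A t-distribution or a historical bootstrap gives more realistic results.
Transaction costs: Be sure to account for the fees and taxes incurred when rebalancing. Monthly or quarterly rebalancing is typical.
Correlation instability: In a crisis, correlations between assets spike and the diversification benefit weakens. Always check crisis scenarios with stress tests.
Data window: Use at least five years of data, but keep in mind that very old data may not reflect the current market structure.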
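
As one way to relax the normality assumption mentioned in this list, the sketch below resamples historical trading days with replacement (a historical bootstrap) instead of drawing from a fitted normal distribution. `bootstrap_final_values` is a hypothetical helper, and the "history" here is synthetic fat-tailed data generated for the demonstration.

```python
import numpy as np

def bootstrap_final_values(daily_returns, weights, num_simulations=5_000,
                           holding_days=252, initial=100_000, seed=0):
    """Historical bootstrap of final portfolio values (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Portfolio return on each historical day. Resampling whole days keeps
    # the cross-asset relationships observed on that day.
    port_daily = np.asarray(daily_returns) @ np.asarray(weights)
    # Draw day indices with replacement for every simulated path.
    idx = rng.integers(0, len(port_daily), size=(num_simulations, holding_days))
    sampled = port_daily[idx]
    return initial * np.prod(1 + sampled, axis=1)

# Synthetic fat-tailed daily returns for three assets, for illustration only.
rng = np.random.default_rng(1)
hist = 0.005 * rng.standard_t(df=4, size=(1500, 3))
weights = np.array([0.5, 0.3, 0.2])

final_values = bootstrap_final_values(hist, weights)
print("5th / 50th / 95th percentile:",
      np.percentile(final_values, [5, 50, 95]).round(0))
```

Because days are drawn independently, this simple version discards serial patterns such as volatility clustering; a block bootstrap that resamples consecutive runs of days is a common refinement.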

Frequently Asked Questions (FAQ)

How many Monte Carlo simulations are appropriate?

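50,000 to 100,000 is typical. With five or fewer assets, 10,000 is enough, but with ten or more we recommend at least 100,000. The right number is the point at which adding more simulations no longer changes the results materially.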
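
The stopping rule in this answer can be checked empirically by tracking the best Sharpe ratio found as the sample grows. This is a minimal sketch with synthetic inputs (`mu`, `cov`, and `rf` are illustrative assumptions); it uses nested samples so that each larger run contains the smaller ones.

```python
import numpy as np

# Synthetic annualized inputs, for illustration only.
mu = np.array([0.08, 0.10, 0.03])
cov = np.array([[0.04, 0.03, 0.00],
                [0.03, 0.09, 0.00],
                [0.00, 0.00, 0.01]])
rf = 0.02

rng = np.random.default_rng(0)
weights = rng.dirichlet(np.ones(len(mu)), size=100_000)
vols = np.sqrt(np.einsum('ij,jk,ik->i', weights, cov, weights))
sharpes = (weights @ mu - rf) / vols

# Best Sharpe among the first n portfolios, for increasing n.
best_by_n = {n: sharpes[:n].max() for n in (1_000, 10_000, 100_000)}
for n, best in best_by_n.items():
    print(f"{n:>7,} portfolios: best Sharpe {best:.4f}")
```

When the improvement from one sample size to the next becomes negligible, further simulations add little. With more assets the weight space is much larger, so the plateau typically arrives later.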

Is a portfolio on the efficient frontier always the best choice?

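No. The frontier is estimated from historical data, so it does not guarantee future performance. It is optimal only with respect to the past sample, so you should confirm robustness with walk-forward validation and stress tests. It also helps to pair it with a Bayesian approach such as the Black-Litterman model to reduce estimation error.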

Can this be applied to a cryptocurrency portfolio?

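Yes, but be careful about the extreme volatility and short history of cryptocurrencies. A short data window means large estimation error, so we recommend applying conservative constraints (such as a 20% maximum weight) and rebalancing frequently.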

Conclusion: A Roadmap for Monte Carlo Portfolio Optimization

Monte Carlo simulation is a powerful tool for tackling complex portfolio optimization in an intuitive way. Try applying it in practice with the following steps.

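Basic simulation → generate random portfolios and visualize the efficient frontier
Mathematical optimization → compute the maximum Sharpe and minimum variance portfolios precisely with scipy
Risk analysis → quantify worst-case scenarios with VaR and CVaR
Constraints → optimize under real-world investment rules
Walk-forward validation → guard against overfitting and confirm the strategy's robustness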

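Once your portfolio optimization is complete, apply Kelly criterion money management to decide the optimal leverage for each asset, and combine it with factor investing strategies to build a systematic investment process: factor-based asset selection plus Monte Carlo allocation optimization.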