Case Study: Memory Optimization
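This case study is based on the actual code in .claude/lib/config_loader.py and shows how to use __slots__ and weakref to reduce memory usage.
Prerequisites
Problem Background
Existing Design
config_loader.py uses module-level global dictionaries as its cache, which is a common design pattern: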
from typing import Optional

# Excerpt: load_config() and _get_default_agents_config() are defined
# elsewhere and are not shown here.

# Global cache variables
_agents_config_cache: Optional[dict] = None
_quality_rules_cache: Optional[dict] = None

def load_agents_config() -> dict:
    """
    Load the agent configuration.

    Uses a module-level variable as a cache to avoid re-reading the file.
    """
    global _agents_config_cache
    if _agents_config_cache is None:
        try:
            _agents_config_cache = load_config("agents")
        except FileNotFoundError:
            _agents_config_cache = _get_default_agents_config()
    return _agents_config_cache

def clear_config_cache() -> None:
    """Clear the configuration cache (for tests or hot-reloading the config)."""
    global _agents_config_cache, _quality_rules_cache
    _agents_config_cache = None
    _quality_rules_cache = None

This design is simple and intuitive, but it runs into memory problems once the system needs to cache more complex objects.
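To see the pattern in isolation, here is a minimal runnable sketch. The `load_config` stand-in and the `load_calls` counter are hypothetical and exist only for illustration; they are not the real loader from config_loader.py.

```python
from typing import Optional

# Hypothetical stand-in for the real file loader; counts how often it runs.
load_calls = 0

def load_config(name: str) -> dict:
    """Pretend to read a config file (illustrative only)."""
    global load_calls
    load_calls += 1
    return {"name": name, "agents": ["reviewer", "planner"]}

_agents_config_cache: Optional[dict] = None

def load_agents_config() -> dict:
    """Return the cached config, loading it on first use."""
    global _agents_config_cache
    if _agents_config_cache is None:
        _agents_config_cache = load_config("agents")
    return _agents_config_cache

def clear_config_cache() -> None:
    """Drop the cached value so the next call reloads it."""
    global _agents_config_cache
    _agents_config_cache = None

first = load_agents_config()
second = load_agents_config()        # served from the cache
same_object = first is second
calls_before_clear = load_calls

clear_config_cache()
third = load_agents_config()         # reloaded after the cache was cleared
calls_after_clear = load_calls

print(same_object, calls_before_clear, calls_after_clear)
```

The module-level variable holds a strong reference for the lifetime of the process, so the cached object stays in memory until `clear_config_cache()` is called.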
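Memory Problems
When caching a large number of objects:
- Python dictionaries carry extra overhead: each dictionary has to maintain a hash table, keys, and values
- Each object's __dict__ consumes memory: every instance has its own attribute dictionary
- Caches can cause memory leaks: strong references prevent objects from being reclaimed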
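The third point can be observed directly. The sketch below uses a hypothetical `Payload` class and a weak reference as a probe to check whether the object is still alive; it assumes CPython's reference-counting behavior.

```python
import gc
import weakref

class Payload:
    """Stand-in for an expensive cached object (hypothetical)."""
    def __init__(self, name: str):
        self.name = name

strong_cache = {}            # ordinary dict: holds strong references

obj = Payload("report")
strong_cache["report"] = obj
probe = weakref.ref(obj)     # observes liveness without keeping obj alive

del obj                      # the caller is done with the object...
gc.collect()
alive_while_cached = probe() is not None   # ...but the cache still pins it

strong_cache.clear()         # only an explicit eviction releases it
gc.collect()
alive_after_clear = probe() is not None

print(alive_while_cached, alive_after_clear)
```

As long as the dictionary holds the entry, the object cannot be reclaimed, even though no caller uses it any more.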
Let's use a more complex caching scenario to illustrate the problem:
import sys

class ConfigItem:
    """Configuration item that simulates a complex cached object."""
    def __init__(self, key: str, value: str, metadata: dict):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.access_count = 0
        self.last_accessed = None

# Create a config item and measure memory
item = ConfigItem("database.host", "localhost", {"type": "string"})

# Object size (shallow: does not include the attribute dictionary)
print(f"ConfigItem size: {sys.getsizeof(item)} bytes")
# e.g. ConfigItem size: 48 bytes (exact value varies by Python version)

# But the real cost is in __dict__
print(f"__dict__ size: {sys.getsizeof(item.__dict__)} bytes")
# e.g. __dict__ size: 184 bytes (exact value varies by Python version)

When tens of thousands of such objects are cached, the memory overhead becomes considerable.
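As a rough back-of-the-envelope check, the two shallow sizes can be added and scaled up. This is only a sketch: `shallow_footprint` and the count of 50,000 are assumptions for illustration, the referenced values (strings, the metadata dict) are not counted, and CPython's internal sharing of dictionary keys means real usage can differ.

```python
import sys

class ConfigItem:
    """Same shape as the ConfigItem class above."""
    def __init__(self, key: str, value: str, metadata: dict):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.access_count = 0
        self.last_accessed = None

def shallow_footprint(obj) -> int:
    """Instance size plus its attribute dict; referenced values are not counted."""
    return sys.getsizeof(obj) + sys.getsizeof(vars(obj))

item = ConfigItem("database.host", "localhost", {"type": "string"})
per_object = shallow_footprint(item)

count = 50_000  # hypothetical number of cached items
estimated_mib = per_object * count / (1024 * 1024)

print(f"~{per_object} bytes per object, "
      f"~{estimated_mib:.1f} MiB for {count:,} objects")
```

For real measurements, tracemalloc (used later in this case study) is more reliable than summing `sys.getsizeof` values.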
Advanced Solution
Optimization Goals
- Reduce the memory footprint of each object
- Avoid memory leaks caused by caching
- Keep the API unchanged
Implementation Steps
Step 1: Use __slots__ to Reduce Object Size
__slots__ declares the fixed set of attributes a class will have, so Python can store them in a compact per-instance layout instead of a per-instance __dict__:
import sys

class ConfigItemWithoutSlots:
    """Standard class, uses __dict__"""
    def __init__(self, key: str, value: str, metadata: dict):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.access_count = 0
        self.last_accessed = None

class ConfigItemWithSlots:
    """Uses __slots__ to reduce memory"""
    __slots__ = ['key', 'value', 'metadata', 'access_count', 'last_accessed']

    def __init__(self, key: str, value: str, metadata: dict):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.access_count = 0
        self.last_accessed = None

# Compare memory usage
item_without = ConfigItemWithoutSlots("db.host", "localhost", {"type": "str"})
item_with = ConfigItemWithSlots("db.host", "localhost", {"type": "str"})

# sys.getsizeof is shallow, so the first number excludes the __dict__
print(f"Without __slots__: {sys.getsizeof(item_without)} bytes")
print(f"With __slots__: {sys.getsizeof(item_with)} bytes")

# The real difference is __dict__
print(f"__dict__ overhead: {sys.getsizeof(item_without.__dict__)} bytes")
# item_with has no __dict__!
try:
    item_with.__dict__
except AttributeError as e:
    print(f"No __dict__: {e}")
Memory Layout Comparison
Without __slots__ (approximate, 64-bit CPython):
┌──────────────────────────────────┐
│ PyObject header           (16 B) │
│ __dict__ pointer           (8 B) │
│ __weakref__ pointer        (8 B) │
│                                  │
│ __dict__ (separate object):      │
│   - hash table            (64 B) │
│   - keys array            (40 B) │
│   - values array          (40 B) │
│   - key strings          (~80 B) │
│                                  │
│ Total: ~256 bytes per object     │
└──────────────────────────────────┘

With __slots__ (approximate, 64-bit CPython):
┌──────────────────────────────────┐
│ PyObject header           (16 B) │
│ key slot                   (8 B) │
│ value slot                 (8 B) │
│ metadata slot              (8 B) │
│ access_count slot          (8 B) │
│ last_accessed slot         (8 B) │
│                                  │
│ Total: ~56 bytes per object      │
└──────────────────────────────────┘
Memory Savings with Many Objects
import tracemalloc

# ConfigItemWithoutSlots and ConfigItemWithSlots are the classes defined above.

def measure_memory(cls, count=10000):
    """Measure memory for creating multiple objects"""
    tracemalloc.start()

    objects = [
        cls(f"key_{i}", f"value_{i}", {"index": i})
        for i in range(count)
    ]

    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return current, peak, objects

# Measure both classes
mem_without, peak_without, _ = measure_memory(ConfigItemWithoutSlots)
mem_with, peak_with, _ = measure_memory(ConfigItemWithSlots)

print(f"Without __slots__: {mem_without / 1024 / 1024:.2f} MB")
print(f"With __slots__: {mem_with / 1024 / 1024:.2f} MB")
print(f"Savings: {(mem_without - mem_with) / 1024 / 1024:.2f} MB")
print(f"Ratio: {mem_without / mem_with:.1f}x")

# Typical output (exact numbers vary by Python version and platform):
# Without __slots__: 3.82 MB
# With __slots__: 1.15 MB
# Savings: 2.67 MB
# Ratio: 3.3x
Step 2: Use weakref to Avoid Strong References
weakref lets us refer to an object without preventing it from being garbage collected:
import weakref

class CacheableConfig:
    """Configuration object that can be weakly referenced"""
    __slots__ = ['key', 'value', '_data', '__weakref__']  # Note: __weakref__ slot

    def __init__(self, key: str, value: str):
        self.key = key
        self.value = value
        self._data = None

    def __repr__(self):
        return f"CacheableConfig({self.key!r}, {self.value!r})"

# Create object and weak reference
config = CacheableConfig("app.name", "MyApp")
weak_ref = weakref.ref(config)

print(f"Object exists: {weak_ref()}")
# Object exists: CacheableConfig('app.name', 'MyApp')

# Delete the strong reference
del config

print(f"After del: {weak_ref()}")
# After del: None
Tracking Object Collection with a Callback
import weakref

def on_finalize(ref):
    """Callback invoked when the referent is garbage collected"""
    print("Object was garbage collected!")

# CacheableConfig is the class defined in the previous example
config = CacheableConfig("db.port", "5432")
weak_ref = weakref.ref(config, on_finalize)

print("Deleting object...")
del config
# Output: Object was garbage collected!
Step 3: Use WeakValueDictionary
WeakValueDictionary is a handy tool for building a cache that cleans itself up:
import gc
import weakref
from typing import Callable, TypeVar, Generic

T = TypeVar('T')

class WeakCache(Generic[T]):
    """
    Auto-cleaning cache using weak references.

    Objects are automatically removed from cache when no strong
    references exist outside the cache.
    """

    def __init__(self):
        self._cache: weakref.WeakValueDictionary[str, T] = (
            weakref.WeakValueDictionary()
        )
        self._hits = 0
        self._misses = 0

    def get(self, key: str, factory: Callable[[], T]) -> T:
        """
        Get item from cache, creating it if necessary.

        Args:
            key: Cache key
            factory: Function to create value if not cached

        Returns:
            Cached or newly created value
        """
        value = self._cache.get(key)
        if value is not None:
            self._hits += 1
            return value

        self._misses += 1
        value = factory()
        self._cache[key] = value
        return value

    def __len__(self) -> int:
        return len(self._cache)

    def stats(self) -> dict:
        """Return cache statistics"""
        total = self._hits + self._misses
        hit_rate = self._hits / total if total > 0 else 0
        return {
            "hits": self._hits,
            "misses": self._misses,
            "hit_rate": f"{hit_rate:.1%}",
            "size": len(self._cache),
        }

# Demo: automatic cleanup (CacheableConfig is the class from Step 2)
cache = WeakCache[CacheableConfig]()

# Create and cache object
config1 = cache.get("app.name", lambda: CacheableConfig("app.name", "MyApp"))
config2 = cache.get("app.name", lambda: CacheableConfig("app.name", "MyApp"))  # Cache hit

print(f"Cache size: {len(cache)}")  # 1
print(f"Same object: {config1 is config2}")  # True
print(f"Stats: {cache.stats()}")  # hits=1, misses=1

# Delete strong references
del config1
del config2

# Object is garbage collected, cache is auto-cleaned
gc.collect()

print(f"Cache size after cleanup: {len(cache)}")  # 0
Step 4: Measure Memory Usage
Use sys.getsizeof and tracemalloc (plus the third-party pympler package for deep sizes) to take accurate measurements:
import sys
import tracemalloc
from pympler import asizeof  # pip install pympler

# ConfigItemWithoutSlots and ConfigItemWithSlots are the classes from Step 1.

def measure_object_size(obj, label="Object"):
    """Measure object size using different methods"""

    # Basic size (doesn't include referenced objects)
    basic = sys.getsizeof(obj)

    # Deep size (includes all referenced objects)
    # Using pympler for accurate measurement
    deep = asizeof.asizeof(obj)

    print(f"{label}:")
    print(f"  sys.getsizeof: {basic:,} bytes")
    print(f"  pympler deep: {deep:,} bytes")

    return basic, deep

# Compare different object types
item_without = ConfigItemWithoutSlots("key", "value", {"a": 1, "b": 2})
item_with = ConfigItemWithSlots("key", "value", {"a": 1, "b": 2})

measure_object_size(item_without, "Without __slots__")
measure_object_size(item_with, "With __slots__")

# Using tracemalloc for allocation tracking
def track_allocations():
    """Track memory allocations during execution"""
    tracemalloc.start()

    # Simulate creating many cached objects
    items = []
    for i in range(1000):
        items.append(ConfigItemWithSlots(
            f"config.item.{i}",
            f"value_{i}",
            {"index": i, "active": True}
        ))

    # Get snapshot
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics('lineno')

    print("\nTop 5 memory allocations:")
    for stat in top_stats[:5]:
        print(f"  {stat}")

    # Get traced memory
    current, peak = tracemalloc.get_traced_memory()
    print(f"\nCurrent memory: {current / 1024:.1f} KB")
    print(f"Peak memory: {peak / 1024:.1f} KB")

    tracemalloc.stop()
    return items

track_allocations()
Complete Script for Comparing Memory Differences
import gc
import tracemalloc
from dataclasses import dataclass

class StandardConfig:
    """Standard class with __dict__"""
    def __init__(self, key, value, metadata):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.hits = 0

class SlottedConfig:
    """Optimized with __slots__"""
    __slots__ = ['key', 'value', 'metadata', 'hits']

    def __init__(self, key, value, metadata):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.hits = 0

@dataclass
class DataclassConfig:
    """Using dataclass"""
    key: str
    value: str
    metadata: dict
    hits: int = 0

@dataclass(slots=True)  # Python 3.10+
class SlottedDataclass:
    """Dataclass with __slots__"""
    key: str
    value: str
    metadata: dict
    hits: int = 0

def benchmark_memory(cls, count=10000, label=""):
    """Benchmark memory usage for a class"""
    gc.collect()
    tracemalloc.start()

    objects = [
        cls(f"key_{i}", f"value_{i}", {"index": i})
        for i in range(count)
    ]

    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    per_object = current / count
    print(f"{label or cls.__name__:25} | "
          f"{current/1024:8.1f} KB | "
          f"{per_object:6.1f} B/obj")

    return objects  # Returned so the caller can keep the objects alive

print(f"{'Class':25} | {'Total':>10} | {'Per Object':>10}")
print("-" * 55)

benchmark_memory(StandardConfig)
benchmark_memory(SlottedConfig)
benchmark_memory(DataclassConfig)
benchmark_memory(SlottedDataclass)

# Typical output (exact numbers vary by Python version and platform):
# Class                     |      Total | Per Object
# -------------------------------------------------------
# StandardConfig            |   2578.5 KB |  263.6 B/obj
# SlottedConfig             |    859.4 KB |   87.9 B/obj
# DataclassConfig           |   2656.3 KB |  271.6 B/obj
# SlottedDataclass          |    898.4 KB |   91.9 B/obj
Complete Code
The following complete implementation combines all of the optimization techniques:
"""
Memory-optimized configuration cache system.

This module demonstrates:
- Using __slots__ to reduce object memory footprint
- Using weakref for automatic cache cleanup
- Using tracemalloc for memory profiling

Based on patterns from .claude/lib/config_loader.py
"""

import weakref
import sys
import gc
import tracemalloc
from typing import Any, Callable, Optional
from datetime import datetime

class ConfigEntry:
    """
    Memory-optimized configuration entry.

    Uses __slots__ to reduce memory footprint by ~3x compared
    to regular classes.
    """
    __slots__ = [
        'key', 'value', 'source', 'loaded_at',
        'access_count', '__weakref__'
    ]

    def __init__(
        self,
        key: str,
        value: Any,
        source: Optional[str] = None
    ):
        self.key = key
        self.value = value
        self.source = source
        self.loaded_at = datetime.now()
        self.access_count = 0

    def __repr__(self) -> str:
        return (
            f"ConfigEntry(key={self.key!r}, "
            f"value={self.value!r}, "
            f"accesses={self.access_count})"
        )

    def touch(self) -> None:
        """Record an access to this entry"""
        self.access_count += 1

class SmartConfigCache:
    """
    Smart configuration cache with automatic memory management.

    Features:
    - Weak references for automatic cleanup
    - Memory usage tracking
    - Hit/miss statistics
    - Optional size limit (applies to pinned entries)

    Example:
        cache = SmartConfigCache(max_size=1000)

        # Get or create config
        config = cache.get_or_create(
            "database.host",
            lambda: ConfigEntry("database.host", "localhost", "env")
        )

        # Check stats
        print(cache.stats())
    """

    def __init__(self, max_size: Optional[int] = None):
        """
        Initialize the cache.

        Args:
            max_size: Maximum number of pinned entries. None for unlimited.
        """
        # Every entry is registered here; entries vanish once nothing
        # else holds a strong reference to them.
        self._cache: weakref.WeakValueDictionary[str, ConfigEntry] = (
            weakref.WeakValueDictionary()
        )
        self._strong_refs: dict[str, ConfigEntry] = {}  # Keep important items
        self._max_size = max_size
        self._hits = 0
        self._misses = 0
        self._evictions = 0

    def get(self, key: str) -> Optional[ConfigEntry]:
        """
        Get entry from cache.

        Args:
            key: Configuration key

        Returns:
            ConfigEntry if found, None otherwise
        """
        # Check strong refs first
        entry = self._strong_refs.get(key)
        if entry is not None:
            self._hits += 1
            entry.touch()
            return entry

        # Then check weak refs
        entry = self._cache.get(key)
        if entry is not None:
            self._hits += 1
            entry.touch()
            return entry

        self._misses += 1
        return None

    def get_or_create(
        self,
        key: str,
        factory: Callable[[], ConfigEntry],
        keep_strong: bool = False
    ) -> ConfigEntry:
        """
        Get existing entry or create new one.

        Args:
            key: Configuration key
            factory: Function to create entry if not found
            keep_strong: If True, keep a strong reference (won't auto-cleanup)

        Returns:
            Existing or newly created ConfigEntry
        """
        entry = self.get(key)
        if entry is not None:
            return entry

        # Create new entry
        entry = factory()
        self._cache[key] = entry

        if keep_strong:
            self._enforce_size_limit()
            self._strong_refs[key] = entry

        return entry

    def _enforce_size_limit(self) -> None:
        """Evict pinned entries if the pinned set is full"""
        if self._max_size is None:
            return

        while len(self._strong_refs) >= self._max_size:
            # Evict least accessed entry
            if not self._strong_refs:
                break

            min_key = min(
                self._strong_refs.keys(),
                key=lambda k: self._strong_refs[k].access_count
            )
            del self._strong_refs[min_key]
            self._evictions += 1

    def pin(self, key: str) -> bool:
        """
        Pin an entry to prevent automatic cleanup.

        Args:
            key: Configuration key

        Returns:
            True if entry was pinned, False if not found
        """
        entry = self._cache.get(key)
        if entry is None:
            return False

        self._enforce_size_limit()
        self._strong_refs[key] = entry
        return True

    def unpin(self, key: str) -> bool:
        """
        Unpin an entry to allow automatic cleanup.

        Args:
            key: Configuration key

        Returns:
            True if entry was unpinned, False if not found
        """
        if key in self._strong_refs:
            del self._strong_refs[key]
            return True
        return False

    def clear(self) -> None:
        """Clear all cached entries"""
        self._cache.clear()
        self._strong_refs.clear()

    def stats(self) -> dict:
        """
        Get cache statistics.

        Returns:
            Dict with hits, misses, hit_rate, total_size, weak_refs,
            pinned, evictions
        """
        lookups = self._hits + self._misses
        hit_rate = self._hits / lookups if lookups > 0 else 0.0

        # Pinned entries also live in the weak dictionary, so its length
        # is the total; subtract the pinned ones to avoid double counting.
        live = len(self._cache)
        pinned = len(self._strong_refs)

        return {
            "hits": self._hits,
            "misses": self._misses,
            "hit_rate": f"{hit_rate:.1%}",
            "total_size": live,
            "weak_refs": live - pinned,
            "pinned": pinned,
            "evictions": self._evictions,
        }

    def memory_usage(self) -> dict:
        """
        Estimate memory usage of cached entries.

        Returns:
            Dict with entry_count, estimated_bytes, per_entry_bytes
        """
        # The weak dictionary already contains the pinned entries
        all_entries = list(self._cache.values())

        if not all_entries:
            return {
                "entry_count": 0,
                "estimated_bytes": 0,
                "per_entry_bytes": 0,
            }

        # Estimate based on first entry (shallow size only)
        sample = all_entries[0]
        per_entry = sys.getsizeof(sample)

        return {
            "entry_count": len(all_entries),
            "estimated_bytes": per_entry * len(all_entries),
            "per_entry_bytes": per_entry,
        }

def demo_memory_optimization():
    """Demonstrate memory optimization techniques"""

    print("=" * 60)
    print("Memory Optimization Demo")
    print("=" * 60)

    # Start memory tracking
    tracemalloc.start()
    gc.collect()
    snapshot1 = tracemalloc.take_snapshot()

    # Create cache and populate
    cache = SmartConfigCache(max_size=100)

    # Simulate loading many configurations
    entries = []
    for i in range(1000):
        entry = cache.get_or_create(
            f"config.item.{i}",
            lambda i=i: ConfigEntry(
                f"config.item.{i}",
                f"value_{i}",
                "demo"
            ),
            keep_strong=(i < 100)  # Pin first 100
        )
        entries.append(entry)

    # Take snapshot after creation
    gc.collect()
    snapshot2 = tracemalloc.take_snapshot()

    # Print stats
    print("\nCache Statistics:")
    for key, value in cache.stats().items():
        print(f"  {key}: {value}")

    print("\nMemory Usage:")
    for key, value in cache.memory_usage().items():
        print(f"  {key}: {value:,}")

    # Show memory diff
    diff = snapshot2.compare_to(snapshot1, 'lineno')
    print("\nTop 5 Memory Allocations:")
    for stat in diff[:5]:
        print(f"  {stat}")

    # Demo weak reference cleanup
    print("\n" + "-" * 60)
    print("Weak Reference Cleanup Demo")
    print("-" * 60)

    print(f"Before cleanup - Cache size: {cache.stats()['total_size']}")

    # Delete external references to unpinned entries
    del entry    # the loop variable still references the last entry
    del entries
    gc.collect()

    print(f"After cleanup - Cache size: {cache.stats()['total_size']}")
    print("(Only pinned entries remain)")

    tracemalloc.stop()

if __name__ == "__main__":
    demo_memory_optimization()
Usage Example
from memory_optimized_cache import SmartConfigCache, ConfigEntry

# Initialize cache with size limit
cache = SmartConfigCache(max_size=1000)

# Load configuration
def load_database_config():
    """Factory function to load database config"""
    return ConfigEntry(
        key="database",
        value={
            "host": "localhost",
            "port": 5432,
            "name": "myapp"
        },
        source="config.yaml"
    )

# Get or create (with strong reference for important config)
db_config = cache.get_or_create(
    "database",
    load_database_config,
    keep_strong=True  # Keep in memory
)

print(f"Database host: {db_config.value['host']}")

# Temporary config (will be auto-cleaned when not referenced)
temp_config = cache.get_or_create(
    "temp.setting",
    lambda: ConfigEntry("temp.setting", "temporary", "runtime")
)

# Check statistics
print(cache.stats())
# {'hits': 0, 'misses': 2, 'hit_rate': '0.0%',
#  'total_size': 2, 'weak_refs': 1, 'pinned': 1, 'evictions': 0}

# Memory usage
print(cache.memory_usage())
# {'entry_count': 2, 'estimated_bytes': ..., 'per_entry_bytes': ...}
# (byte counts depend on the Python version)
Design Trade-offs
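__slots__ vs. Standard Classes
| Aspect | Standard class | __slots__ |
|---|---|---|
| Memory footprint | Higher (has __dict__) | Lower (~60-70% less) |
| Dynamic attributes | Supports obj.new_attr = x | Not supported (unless __dict__ is added to __slots__) |
| Inheritance | Simple | Subclasses must declare their own __slots__, otherwise they get a __dict__ again |
| Weak references | Supported by default | Requires a __weakref__ slot |
| Pickle | Supported directly | Works with protocol 2+ (including the Python 3 default); protocols 0 and 1 need __getstate__/__setstate__ |
| Multiple inheritance | Works normally | At most one parent class may have non-empty __slots__ |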
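Several rows of this table can be checked in a few lines. The classes below (`Slotted`, `SlottedWeak`, `Child`) are hypothetical and exist only to demonstrate the restrictions; this is a sketch, not part of the cache implementation.

```python
import weakref

class Slotted:
    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x

class SlottedWeak:
    __slots__ = ("x", "__weakref__")   # opt back in to weak references

    def __init__(self, x):
        self.x = x

class Child(Slotted):
    """Declares no __slots__, so its instances get a __dict__ again."""

results = {}

# Dynamic attributes are rejected on a slotted instance.
try:
    Slotted(1).extra = 2
    results["dynamic_attr"] = "allowed"
except AttributeError:
    results["dynamic_attr"] = "AttributeError"

# Weak references fail unless __weakref__ is listed in __slots__.
try:
    weakref.ref(Slotted(1))
    results["weakref_plain"] = "ok"
except TypeError:
    results["weakref_plain"] = "TypeError"

keep = SlottedWeak(1)
results["weakref_with_slot"] = weakref.ref(keep)() is keep

# A subclass without its own __slots__ silently regains __dict__.
child = Child(1)
child.extra = 2
results["child_has_dict"] = hasattr(child, "__dict__")

print(results)
```

The subclass case is the easiest one to miss: the memory savings disappear for every instance of a subclass that forgets to declare `__slots__`.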
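Strong-Reference vs. Weak-Reference Caches
| Aspect | Strong-reference cache | Weak-reference cache |
|---|---|---|
| Memory management | Requires manual cleanup | Cleans up automatically |
| Data guarantee | Data is always present | Data may have been collected |
| Use case | Critical configuration | Transient data |
| Implementation complexity | Simple | Slightly more complex |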
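The first two rows can be seen side by side in a short sketch. The `Session` class is a hypothetical cached value, and the result assumes CPython, where an object is reclaimed as soon as its last strong reference goes away.

```python
import gc
import weakref

class Session:
    """Hypothetical cached value; plain classes support weak references."""
    def __init__(self, user: str):
        self.user = user

strong = {}                              # strong-reference cache
weak = weakref.WeakValueDictionary()     # weak-reference cache

a = Session("alice")
b = Session("bob")
strong["alice"] = a
weak["bob"] = b

# The callers drop their references.
del a, b
gc.collect()

strong_size = len(strong)      # entry survives until removed manually
weak_size = len(weak)          # entry disappeared on its own
bob_lookup = weak.get("bob")   # callers must handle a missing value

print(strong_size, weak_size, bob_lookup)
```

This is why code that reads from a weak-reference cache always needs a fallback path that recreates the value.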
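Which Technique to Use When?
Decision tree:

Are there many objects?
├── Yes → consider __slots__
│   └── Do you need dynamic attributes?
│       ├── Yes → __slots__ = [..., '__dict__']
│       └── No → __slots__ = [...]
└── No → a standard class is fine

Can the cache grow without bound?
├── Yes → use WeakValueDictionary or an LRU cache
└── No → a plain dictionary is fine

Can the data be garbage collected?
├── Yes → weakref
└── No → strong references
When Should You Use These Techniques?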
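Good Fit
- Creating large numbers of small objects: data points, events, configuration items
- Memory usage is the bottleneck: confirmed by profiling
- The cache can grow without bound: user sessions, request data
- Long-running services: web servers, daemons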
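When a cache can grow without bound but its entries must not disappear while still useful, a size-bounded LRU cache is the alternative to weak references. The `LRUCache` class below is a hypothetical minimal sketch built on `collections.OrderedDict`; for memoizing function calls, the standard library's `functools.lru_cache` covers the same idea.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable

class LRUCache:
    """Hypothetical bounded cache: evicts the least recently used key."""

    def __init__(self, max_size: int):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get_or_create(self, key: Hashable, factory: Callable[[], Any]) -> Any:
        if key in self._data:
            self._data.move_to_end(key)      # mark as most recently used
            return self._data[key]
        value = factory()
        self._data[key] = value
        if len(self._data) > self._max_size:
            self._data.popitem(last=False)   # drop the least recently used
        return value

    def keys(self) -> list:
        """Keys ordered from least to most recently used."""
        return list(self._data)

    def __contains__(self, key: Hashable) -> bool:
        return key in self._data

    def __len__(self) -> int:
        return len(self._data)

cache = LRUCache(max_size=2)
cache.get_or_create("a", lambda: "A")
cache.get_or_create("b", lambda: "B")
cache.get_or_create("a", lambda: "A")   # touch "a", so "b" is now the oldest
cache.get_or_create("c", lambda: "C")   # exceeds max_size and evicts "b"

remaining = cache.keys()
print(remaining)
```

Unlike a weak-reference cache, eviction here is driven by the size limit rather than by whether callers still hold the object.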
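Not Recommended
- Few objects: the optimization has little effect
- Attributes need to be added dynamically: __slots__ limits flexibility
- Premature optimization: first confirm that there really is a problem
- Code readability comes first: standard classes are more intuitive
Optimization Decision Process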
# Step 1: Profile first!
# Don't optimize until you know where the problem is

import tracemalloc

tracemalloc.start()
# ... run your code ...
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

for stat in top_stats[:10]:
    print(stat)

# Step 2: If memory is the issue, identify the class
# Look for classes with many instances

import gc
from collections import Counter

counter = Counter(type(obj).__name__ for obj in gc.get_objects())
print(counter.most_common(10))

# Step 3: Only then apply __slots__ to hot classes
Exercises
Basic Exercise: Compare Memory Usage with and without __slots__
Write a script that compares the memory used when creating 100,000 instances of each of the following three classes:
# Exercise: Complete this benchmark script

import sys
import tracemalloc
from collections import namedtuple
from dataclasses import dataclass

# 1. Standard class
class PointStandard:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

# 2. Class with __slots__
class PointSlots:
    __slots__ = ['x', 'y', 'z']
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

# 3. Named tuple
PointNamed = namedtuple('PointNamed', ['x', 'y', 'z'])

# TODO: Write benchmark function
def benchmark(cls, count=100000):
    """Measure memory for creating `count` instances"""
    pass  # Implement this

# TODO: Compare results
# Expected: PointSlots uses ~3x less memory than PointStandard
Advanced Exercise: Implement a weakref Cache
Build an ImageCache class with the following features:
# Exercise: Implement ImageCache

class ImageCache:
    """
    Cache for image data with automatic cleanup.

    Requirements:
    - Use WeakValueDictionary for auto-cleanup
    - Track hit/miss statistics
    - Support maximum size limit
    - Provide memory usage estimation

    Example usage:
        cache = ImageCache(max_size=100)

        img = cache.get_or_load("photo.jpg", lambda: load_image("photo.jpg"))

        print(cache.stats())
        # {'hits': 0, 'misses': 1, 'size': 1}
    """

    def __init__(self, max_size=None):
        # TODO: Initialize cache
        pass

    def get_or_load(self, key, loader):
        # TODO: Implement get or load logic
        pass

    def stats(self):
        # TODO: Return cache statistics
        pass
Challenge: Track Down a Memory Leak with tracemalloc
Given the following code, which leaks memory, use tracemalloc to find the problem and fix it:
# Exercise: Find and fix the memory leak

class EventHandler:
    _handlers = []  # Class variable - potential leak!

    def __init__(self, name):
        self.name = name
        self.callbacks = []
        EventHandler._handlers.append(self)  # Leak: strong reference

    def register(self, callback):
        self.callbacks.append(callback)

    def fire(self, event):
        for cb in self.callbacks:
            cb(event)

def process_events():
    """This function creates handlers but never cleans them up"""
    for i in range(1000):
        handler = EventHandler(f"handler_{i}")
        handler.register(lambda e: print(e))
        handler.fire(f"event_{i}")
        # handler goes out of scope but is still in _handlers!

# TODO:
# 1. Use tracemalloc to measure memory growth
# 2. Identify the leak
# 3. Fix EventHandler to use weak references
# 4. Verify the fix with tracemalloc
Further Reading
- __slots__ official documentation
- weakref official documentation
- tracemalloc official documentation
- Pympler - Memory profiling
- Python Memory Management - Real Python