This case study is based on the actual code in .claude/lib/config_loader.py and shows how to use __slots__ and weakref to reduce memory usage.

Prerequisites

Background

Existing design

config_loader.py uses module-level global variables as its cache, a common design pattern:

    from typing import Optional

    # Excerpt: load_config() and _get_default_agents_config() are defined
    # outside this snippet.

    # Global cache variables
    _agents_config_cache: Optional[dict] = None
    _quality_rules_cache: Optional[dict] = None

    def load_agents_config() -> dict:
        """
        Load the agent configuration.

        A module-level variable serves as the cache so the file is not
        read more than once.
        """
        global _agents_config_cache
        if _agents_config_cache is None:
            try:
                _agents_config_cache = load_config("agents")
            except FileNotFoundError:
                _agents_config_cache = _get_default_agents_config()
        return _agents_config_cache

    def clear_config_cache() -> None:
        """Clear the config cache (for tests or config hot reloading)."""
        global _agents_config_cache, _quality_rules_cache
        _agents_config_cache = None
        _quality_rules_cache = None

This design is simple and intuitive, but once the system needs to cache more complex objects, memory becomes a concern.

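Memory problems

When a cache holds a large number of objects:

  • Python dictionaries carry overhead: each dict maintains a hash table plus its keys and values
  • Each object's __dict__ takes memory: every instance has its own attribute dictionary
  • A cache can leak memory: strong references keep objects from being collected

A more complex caching scenario illustrates the problem:

    import sys

    class ConfigItem:
        """A config item that stands in for a complex cached object."""
        def __init__(self, key: str, value: str, metadata: dict):
            self.key = key
            self.value = value
            self.metadata = metadata
            self.access_count = 0
            self.last_accessed = None

    # Create a config item and measure memory
    item = ConfigItem("database.host", "localhost", {"type": "string"})

    # Shallow size of the instance itself
    print(f"ConfigItem size: {sys.getsizeof(item)} bytes")
    # e.g. ConfigItem size: 48 bytes

    # But the real cost is in __dict__
    print(f"__dict__ size: {sys.getsizeof(item.__dict__)} bytes")
    # e.g. __dict__ size: 184 bytes
    # (exact values depend on the Python version and platform)

With tens of thousands of such objects in a cache, this overhead adds up quickly.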
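To get a feel for the scale, the per-object figures can be extrapolated to a whole cache. The sketch below is only a rough estimate: the helper name estimate_cache_overhead is made up for this illustration, and it counts just the instance and its __dict__, not the strings or the metadata dict each item refers to.

```python
import sys

class ConfigItem:
    """Same shape as the ConfigItem above (regular class with __dict__)."""
    def __init__(self, key, value, metadata):
        self.key = key
        self.value = value
        self.metadata = metadata
        self.access_count = 0
        self.last_accessed = None

def estimate_cache_overhead(count: int) -> int:
    """Extrapolate instance + __dict__ size from a single sample object.

    Hypothetical helper: it ignores the referenced strings and dicts,
    so the real footprint is higher.
    """
    sample = ConfigItem("database.host", "localhost", {"type": "string"})
    per_object = sys.getsizeof(sample) + sys.getsizeof(sample.__dict__)
    return per_object * count

for n in (1_000, 50_000):
    total = estimate_cache_overhead(n)
    print(f"{n:>6} objects: ~{total / 1024:.0f} KB of per-object overhead")
```

The printed numbers differ between Python versions, which is one reason to measure on the interpreter you actually deploy.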


Advanced solution

Optimization goals

  1. Reduce the memory footprint of each object
  2. Avoid memory leaks caused by the cache
  3. Keep the API unchanged

Implementation steps

Step 1: Use __slots__ to shrink objects

__slots__ tells Python exactly which attributes instances of a class will have, so the interpreter can store them in a more compact layout without a per-instance __dict__:

    import sys

    class ConfigItemWithoutSlots:
        """Standard class; attributes live in __dict__."""
        def __init__(self, key: str, value: str, metadata: dict):
            self.key = key
            self.value = value
            self.metadata = metadata
            self.access_count = 0
            self.last_accessed = None

    class ConfigItemWithSlots:
        """Uses __slots__ to save memory."""
        __slots__ = ['key', 'value', 'metadata', 'access_count', 'last_accessed']

        def __init__(self, key: str, value: str, metadata: dict):
            self.key = key
            self.value = value
            self.metadata = metadata
            self.access_count = 0
            self.last_accessed = None

    # Compare memory usage
    item_without = ConfigItemWithoutSlots("db.host", "localhost", {"type": "str"})
    item_with = ConfigItemWithSlots("db.host", "localhost", {"type": "str"})

    print(f"Without __slots__: {sys.getsizeof(item_without)} bytes")
    print(f"With __slots__:    {sys.getsizeof(item_with)} bytes")

    # The real difference is __dict__
    print(f"__dict__ overhead: {sys.getsizeof(item_without.__dict__)} bytes")
    # item_with has no __dict__!
    try:
        item_with.__dict__
    except AttributeError as e:
        print(f"No __dict__: {e}")
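The compact layout comes with a behavioral change worth seeing once: a slotted instance rejects attributes that were not declared, and a subclass only stays compact if it declares __slots__ itself. The class names below are invented for this sketch.

```python
class SlottedItem:
    __slots__ = ("key", "value")

    def __init__(self, key, value):
        self.key = key
        self.value = value

class LooseChild(SlottedItem):
    """No __slots__ here, so instances get a __dict__ again."""

class TightChild(SlottedItem):
    """Declares only the slot it adds; parent slots are inherited."""
    __slots__ = ("source",)

item = SlottedItem("db.host", "localhost")
rejected = False
try:
    item.port = 5432            # 'port' is not declared in __slots__
except AttributeError as exc:
    rejected = True
    print(f"Rejected: {exc}")

loose = LooseChild("db.host", "localhost")
loose.port = 5432               # allowed: stored in the regained __dict__
print(hasattr(loose, "__dict__"))   # True

tight = TightChild("db.host", "localhost")
print(hasattr(tight, "__dict__"))   # False
```

Listing a parent's slot names again in the subclass is unnecessary; the subclass only names what it adds.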

Memory layout comparison

Without __slots__:
┌──────────────────────────────────┐
│ PyObject header         (16 B)   │
│ __dict__ pointer         (8 B)   │
│ __weakref__ pointer      (8 B)   │
│                                  │
│ __dict__ (separate object):      │
│   - hash table          (64 B)   │
│   - keys array          (40 B)   │
│   - values array        (40 B)   │
│   - key strings        (~80 B)   │
│                                  │
│ Total: ~256 bytes per object     │
└──────────────────────────────────┘

With __slots__:
┌──────────────────────────────────┐
│ PyObject header         (16 B)   │
│ key slot                 (8 B)   │
│ value slot               (8 B)   │
│ metadata slot            (8 B)   │
│ access_count slot        (8 B)   │
│ last_accessed slot       (8 B)   │
│                                  │
│ Total: ~56 bytes per object      │
└──────────────────────────────────┘

(Sizes are approximate for 64-bit CPython; the exact layout varies by version.)
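The "one pointer per slot" picture can be checked from Python itself. This is a small sketch that assumes CPython, where a type's __basicsize__ reports the fixed size of its instances; other implementations may not expose the same numbers.

```python
import struct

class ConfigItemWithSlots:
    __slots__ = ("key", "value", "metadata", "access_count", "last_accessed")

pointer_size = struct.calcsize("P")      # size of a C pointer on this build
header_size = object.__basicsize__       # bare object header
slot_bytes = ConfigItemWithSlots.__basicsize__ - header_size

print(f"pointer size: {pointer_size} B")
print(f"header size:  {header_size} B")
print(f"slot storage: {slot_bytes} B "
      f"for {len(ConfigItemWithSlots.__slots__)} slots")
```

Note that sys.getsizeof may report a larger figure than __basicsize__ because it also accounts for garbage-collector bookkeeping.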

Memory savings with many objects

    import tracemalloc

    def measure_memory(cls, count=10000):
        """Measure memory for creating multiple objects"""
        tracemalloc.start()

        objects = [
            cls(f"key_{i}", f"value_{i}", {"index": i})
            for i in range(count)
        ]

        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        return current, peak, objects

    # Measure both classes
    mem_without, peak_without, _ = measure_memory(ConfigItemWithoutSlots)
    mem_with, peak_with, _ = measure_memory(ConfigItemWithSlots)

    print(f"Without __slots__: {mem_without / 1024 / 1024:.2f} MB")
    print(f"With __slots__:    {mem_with / 1024 / 1024:.2f} MB")
    print(f"Savings:           {(mem_without - mem_with) / 1024 / 1024:.2f} MB")
    print(f"Ratio:             {mem_without / mem_with:.1f}x")

    # Illustrative output (varies by Python version and platform):
    # Without __slots__: 3.82 MB
    # With __slots__:    1.15 MB
    # Savings:           2.67 MB
    # Ratio:             3.3x

Step 2: Use weakref to avoid strong references

A weak reference lets us refer to an object without preventing it from being garbage collected:

    import weakref

    class CacheableConfig:
        """A config object that can be weakly referenced."""
        __slots__ = ['key', 'value', '_data', '__weakref__']  # Note: __weakref__ slot

        def __init__(self, key: str, value: str):
            self.key = key
            self.value = value
            self._data = None

        def __repr__(self):
            return f"CacheableConfig({self.key!r}, {self.value!r})"

    # Create object and weak reference
    config = CacheableConfig("app.name", "MyApp")
    weak_ref = weakref.ref(config)

    print(f"Object exists: {weak_ref()}")
    # Object exists: CacheableConfig('app.name', 'MyApp')

    # Delete the only strong reference
    del config

    # CPython frees the object immediately via reference counting
    print(f"After del: {weak_ref()}")
    # After del: None
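Not every object can be the target of a weak reference, which matters when choosing what to put in a weak cache. The sketch below (CPython behavior; the class and helper names are invented for the example) shows that a slotted class needs the '__weakref__' slot, and that a plain dict cannot be weakly referenced while a dict subclass can.

```python
import weakref

class NoWeakrefSlot:
    __slots__ = ("key",)                 # no '__weakref__' slot

class WithWeakrefSlot:
    __slots__ = ("key", "__weakref__")

class ConfigDict(dict):
    """A dict subclass; its instances support weak references."""

def is_weak_referenceable(obj) -> bool:
    """Return True if weakref.ref(obj) succeeds."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

print(is_weak_referenceable(NoWeakrefSlot()))    # False
print(is_weak_referenceable(WithWeakrefSlot()))  # True
print(is_weak_referenceable({"a": 1}))           # False
print(is_weak_referenceable(ConfigDict(a=1)))    # True
```

This is relevant to config_loader.py, whose cached values are plain dicts: they would need to be wrapped in a class (or a dict subclass) before a weak-reference cache could hold them.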

Tracking object collection with a callback

    import weakref

    def on_finalize(ref):
        """Callback invoked when the referent is garbage collected."""
        print("Object was garbage collected!")

    config = CacheableConfig("db.port", "5432")
    weak_ref = weakref.ref(config, on_finalize)

    print("Deleting object...")
    del config
    # Output: Object was garbage collected!
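The standard library also offers weakref.finalize, which registers a cleanup function together with its arguments, so the callback can know which object went away. A minimal sketch, using a cut-down CacheableConfig and a hypothetical `released` list as the bookkeeping target:

```python
import gc
import weakref

class CacheableConfig:
    __slots__ = ("key", "value", "__weakref__")

    def __init__(self, key, value):
        self.key = key
        self.value = value

released = []   # keys of configs that have been collected (illustrative)

config = CacheableConfig("db.port", "5432")
# Pass plain data (the key), not the object itself: a callback that
# references `config` would keep it alive forever.
finalizer = weakref.finalize(config, released.append, config.key)

print(finalizer.alive)   # True
del config
gc.collect()             # not required on CPython, harmless elsewhere
print(finalizer.alive)   # False
print(released)          # ['db.port']
```

Unlike a weakref.ref callback, the finalizer does not need to be kept in a variable to stay registered.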

Step 3: Use WeakValueDictionary

WeakValueDictionary is a handy building block for a self-cleaning cache:

    import gc
    import weakref
    from typing import Callable, TypeVar, Generic

    T = TypeVar('T')

    class WeakCache(Generic[T]):
        """
        Auto-cleaning cache using weak references.

        Objects are automatically removed from cache when no strong
        references exist outside the cache.
        """

        def __init__(self):
            self._cache: weakref.WeakValueDictionary[str, T] = (
                weakref.WeakValueDictionary()
            )
            self._hits = 0
            self._misses = 0

        def get(self, key: str, factory: Callable[[], T]) -> T:
            """
            Get item from cache, creating it if necessary.

            Args:
                key: Cache key
                factory: Function to create value if not cached

            Returns:
                Cached or newly created value
            """
            value = self._cache.get(key)
            if value is not None:
                self._hits += 1
                return value

            self._misses += 1
            value = factory()
            self._cache[key] = value
            return value

        def __len__(self) -> int:
            return len(self._cache)

        def stats(self) -> dict:
            """Return cache statistics"""
            total = self._hits + self._misses
            hit_rate = self._hits / total if total > 0 else 0
            return {
                "hits": self._hits,
                "misses": self._misses,
                "hit_rate": f"{hit_rate:.1%}",
                "size": len(self._cache),
            }

    # Demo: automatic cleanup
    cache = WeakCache[CacheableConfig]()

    # Create and cache object
    config1 = cache.get("app.name", lambda: CacheableConfig("app.name", "MyApp"))
    config2 = cache.get("app.name", lambda: CacheableConfig("app.name", "MyApp"))  # Cache hit

    print(f"Cache size: {len(cache)}")  # 1
    print(f"Same object: {config1 is config2}")  # True
    print(f"Stats: {cache.stats()}")  # hits=1, misses=1

    # Delete the strong references
    del config1
    del config2

    # Object is garbage collected, cache is auto-cleaned
    gc.collect()

    print(f"Cache size after cleanup: {len(cache)}")  # 0
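One consequence of this design is easy to overlook: a weak cache only helps while some caller is still holding the value. The sketch below uses a bare WeakValueDictionary and a hypothetical load() helper to show that discarding the result turns every lookup into a miss.

```python
import gc
import weakref

class Entry:
    __slots__ = ("value", "__weakref__")

    def __init__(self, value):
        self.value = value

cache = weakref.WeakValueDictionary()
factory_calls = 0

def load(key: str) -> Entry:
    """Return the cached Entry for key, building it on a miss."""
    global factory_calls
    entry = cache.get(key)
    if entry is None:
        factory_calls += 1
        entry = Entry(key.upper())
        cache[key] = entry
    return entry

# Nobody keeps the result, so the entry is collected right away
load("a")
gc.collect()
load("a")
gc.collect()
print(factory_calls)    # 2: the second call was a miss too

# Holding a strong reference keeps the entry cached
held = load("a")
again = load("a")
print(factory_calls)    # 3
print(held is again)    # True
```

If values must survive between unrelated calls, some part of the system has to hold a strong reference, which is the motivation for pinning in the complete implementation.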

Step 4: Measure memory usage

sys.getsizeof reports the shallow size of a single object, while tracemalloc tracks the memory allocated by Python code. The example below combines both, and uses the third-party pympler package for deep sizes:

    import sys
    import tracemalloc
    from pympler import asizeof  # pip install pympler

    def measure_object_size(obj, label="Object"):
        """Measure object size using different methods"""

        # Basic size (doesn't include referenced objects)
        basic = sys.getsizeof(obj)

        # Deep size (includes all referenced objects)
        # Using pympler for accurate measurement
        deep = asizeof.asizeof(obj)

        print(f"{label}:")
        print(f"  sys.getsizeof: {basic:,} bytes")
        print(f"  pympler deep:  {deep:,} bytes")

        return basic, deep

    # Compare different object types
    item_without = ConfigItemWithoutSlots("key", "value", {"a": 1, "b": 2})
    item_with = ConfigItemWithSlots("key", "value", {"a": 1, "b": 2})

    measure_object_size(item_without, "Without __slots__")
    measure_object_size(item_with, "With __slots__")

    # Using tracemalloc for allocation tracking
    def track_allocations():
        """Track memory allocations during execution"""
        tracemalloc.start()

        # Simulate creating many cached objects
        items = []
        for i in range(1000):
            items.append(ConfigItemWithSlots(
                f"config.item.{i}",
                f"value_{i}",
                {"index": i, "active": True}
            ))

        # Get snapshot
        snapshot = tracemalloc.take_snapshot()
        top_stats = snapshot.statistics('lineno')

        print("\nTop 5 memory allocations:")
        for stat in top_stats[:5]:
            print(f"  {stat}")

        # Get traced memory
        current, peak = tracemalloc.get_traced_memory()
        print(f"\nCurrent memory: {current / 1024:.1f} KB")
        print(f"Peak memory:    {peak / 1024:.1f} KB")

        tracemalloc.stop()
        return items

    track_allocations()
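If installing pympler is not an option, a rough deep size can be approximated with the standard library alone. The function below is a simplified sketch written for this tutorial (deep_getsizeof is not a standard API): it follows dicts, common containers, __dict__ and __slots__, and will not match pympler's numbers exactly.

```python
import sys

def deep_getsizeof(obj, _seen=None) -> int:
    """Approximate deep size: obj plus the objects reachable from it."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:              # count shared objects only once
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for k, v in obj.items():
            size += deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for element in obj:
            size += deep_getsizeof(element, seen)

    # Instance attributes: stored in __dict__, in slots, or both
    if hasattr(obj, "__dict__"):
        size += deep_getsizeof(vars(obj), seen)
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):   # __slots__ = "name" declares one slot
            slots = (slots,)
        for name in slots:
            if name not in ("__dict__", "__weakref__") and hasattr(obj, name):
                size += deep_getsizeof(getattr(obj, name), seen)
    return size

class Slotted:
    __slots__ = ("key", "metadata")

    def __init__(self, key, metadata):
        self.key = key
        self.metadata = metadata

item = Slotted("db.host", {"type": "str", "tags": ["a", "b"]})
print(f"shallow: {sys.getsizeof(item)} bytes")
print(f"deep:    {deep_getsizeof(item)} bytes")
```

The gap between the two numbers is a reminder that __slots__ shrinks the instance itself, not the data it points to.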

Complete script for comparing memory differences

    import gc
    import tracemalloc
    from dataclasses import dataclass

    class StandardConfig:
        """Standard class with __dict__"""
        def __init__(self, key, value, metadata):
            self.key = key
            self.value = value
            self.metadata = metadata
            self.hits = 0

    class SlottedConfig:
        """Optimized with __slots__"""
        __slots__ = ['key', 'value', 'metadata', 'hits']

        def __init__(self, key, value, metadata):
            self.key = key
            self.value = value
            self.metadata = metadata
            self.hits = 0

    @dataclass
    class DataclassConfig:
        """Using dataclass"""
        key: str
        value: str
        metadata: dict
        hits: int = 0

    @dataclass(slots=True)  # Python 3.10+
    class SlottedDataclass:
        """Dataclass with __slots__"""
        key: str
        value: str
        metadata: dict
        hits: int = 0

    def benchmark_memory(cls, count=10000, label=""):
        """Benchmark memory usage for a class"""
        gc.collect()
        tracemalloc.start()

        objects = [
            cls(f"key_{i}", f"value_{i}", {"index": i})
            for i in range(count)
        ]

        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        per_object = current / count
        print(f"{label or cls.__name__:25} | "
              f"{current/1024:8.1f} KB | "
              f"{per_object:6.1f} B/obj")

        return objects  # Keep reference to prevent GC

    # Header widths match the row format above
    print(f"{'Class':25} | {'Total':>11} | {'Per Object':>12}")
    print("-" * 54)

    benchmark_memory(StandardConfig)
    benchmark_memory(SlottedConfig)
    benchmark_memory(DataclassConfig)
    benchmark_memory(SlottedDataclass)

    # Illustrative output (varies by Python version and platform):
    # Class                     |       Total |   Per Object
    # ------------------------------------------------------
    # StandardConfig            |   2578.5 KB |  263.6 B/obj
    # SlottedConfig             |    859.4 KB |   87.9 B/obj
    # DataclassConfig           |   2656.3 KB |  271.6 B/obj
    # SlottedDataclass          |    898.4 KB |   91.9 B/obj

Complete code

The following implementation combines all of the optimization techniques:

    """
    Memory-optimized configuration cache system.

    This module demonstrates:
    - Using __slots__ to reduce object memory footprint
    - Using weakref for automatic cache cleanup
    - Using tracemalloc for memory profiling

    Based on patterns from .claude/lib/config_loader.py
    """

    import weakref
    import sys
    import gc
    import tracemalloc
    from typing import Any, Callable, Optional
    from datetime import datetime

    class ConfigEntry:
        """
        Memory-optimized configuration entry.

        Uses __slots__ to reduce memory footprint by ~3x compared
        to regular classes.
        """
        __slots__ = [
            'key', 'value', 'source', 'loaded_at',
            'access_count', '__weakref__'
        ]

        def __init__(
            self,
            key: str,
            value: Any,
            source: Optional[str] = None
        ):
            self.key = key
            self.value = value
            self.source = source
            self.loaded_at = datetime.now()
            self.access_count = 0

        def __repr__(self) -> str:
            return (
                f"ConfigEntry(key={self.key!r}, "
                f"value={self.value!r}, "
                f"accesses={self.access_count})"
            )

        def touch(self) -> None:
            """Record an access to this entry"""
            self.access_count += 1

    class SmartConfigCache:
        """
        Smart configuration cache with automatic memory management.

        Features:
        - Weak references for automatic cleanup
        - Memory usage tracking
        - Hit/miss statistics
        - Optional size limit for pinned entries

        Example:
            cache = SmartConfigCache(max_size=1000)

            # Get or create config
            config = cache.get_or_create(
                "database.host",
                lambda: ConfigEntry("database.host", "localhost", "env")
            )

            # Check stats
            print(cache.stats())
        """

        def __init__(self, max_size: Optional[int] = None):
            """
            Initialize the cache.

            Args:
                max_size: Maximum number of pinned (strongly referenced)
                    entries. None for unlimited.
            """
            # Every entry lives here; pinned entries are additionally
            # held in _strong_refs so they cannot be collected.
            self._cache: weakref.WeakValueDictionary[str, ConfigEntry] = (
                weakref.WeakValueDictionary()
            )
            self._strong_refs: dict[str, ConfigEntry] = {}  # Keep important items
            self._max_size = max_size
            self._hits = 0
            self._misses = 0
            self._evictions = 0

        def get(self, key: str) -> Optional[ConfigEntry]:
            """
            Get entry from cache.

            Args:
                key: Configuration key

            Returns:
                ConfigEntry if found, None otherwise
            """
            # Check strong refs first
            entry = self._strong_refs.get(key)
            if entry is not None:
                self._hits += 1
                entry.touch()
                return entry

            # Then check weak refs
            entry = self._cache.get(key)
            if entry is not None:
                self._hits += 1
                entry.touch()
                return entry

            self._misses += 1
            return None

        def get_or_create(
            self,
            key: str,
            factory: Callable[[], ConfigEntry],
            keep_strong: bool = False
        ) -> ConfigEntry:
            """
            Get existing entry or create new one.

            Args:
                key: Configuration key
                factory: Function to create entry if not found
                keep_strong: If True, keep a strong reference (won't auto-cleanup)

            Returns:
                Existing or newly created ConfigEntry
            """
            entry = self.get(key)
            if entry is not None:
                return entry

            # Create new entry
            entry = factory()
            self._cache[key] = entry

            if keep_strong:
                self._enforce_size_limit()
                self._strong_refs[key] = entry

            return entry

        def _enforce_size_limit(self) -> None:
            """Evict pinned entries until there is room for one more"""
            if self._max_size is None:
                return

            while len(self._strong_refs) >= self._max_size:
                # Evict least accessed entry
                if not self._strong_refs:
                    break

                min_key = min(
                    self._strong_refs.keys(),
                    key=lambda k: self._strong_refs[k].access_count
                )
                del self._strong_refs[min_key]
                self._evictions += 1

        def pin(self, key: str) -> bool:
            """
            Pin an entry to prevent automatic cleanup.

            Args:
                key: Configuration key

            Returns:
                True if entry was pinned, False if not found
            """
            entry = self._cache.get(key)
            if entry is None:
                return False

            # Already pinned: nothing to do, and nothing should be evicted
            if key in self._strong_refs:
                return True

            self._enforce_size_limit()
            self._strong_refs[key] = entry
            return True

        def unpin(self, key: str) -> bool:
            """
            Unpin an entry to allow automatic cleanup.

            Args:
                key: Configuration key

            Returns:
                True if entry was unpinned, False if not found
            """
            if key in self._strong_refs:
                del self._strong_refs[key]
                return True
            return False

        def clear(self) -> None:
            """Clear all cached entries"""
            self._cache.clear()
            self._strong_refs.clear()

        def stats(self) -> dict:
            """
            Get cache statistics.

            Returns:
                Dict with hits, misses, hit_rate, total_size, weak_refs,
                pinned, evictions
            """
            total = self._hits + self._misses
            hit_rate = self._hits / total if total > 0 else 0.0

            # Pinned entries are also stored in the weak dict, so the
            # weak dict alone gives the total; count each entry once.
            return {
                "hits": self._hits,
                "misses": self._misses,
                "hit_rate": f"{hit_rate:.1%}",
                "total_size": len(self._cache),
                "weak_refs": len(self._cache) - len(self._strong_refs),
                "pinned": len(self._strong_refs),
                "evictions": self._evictions,
            }

        def memory_usage(self) -> dict:
            """
            Estimate memory usage of cached entries.

            Returns:
                Dict with entry_count, estimated_bytes, per_entry_bytes
            """
            # The weak dict already contains the pinned entries
            all_entries = list(self._cache.values())

            if not all_entries:
                return {
                    "entry_count": 0,
                    "estimated_bytes": 0,
                    "per_entry_bytes": 0,
                }

            # Estimate based on the shallow size of the first entry
            sample = all_entries[0]
            per_entry = sys.getsizeof(sample)

            return {
                "entry_count": len(all_entries),
                "estimated_bytes": per_entry * len(all_entries),
                "per_entry_bytes": per_entry,
            }

    def demo_memory_optimization():
        """Demonstrate memory optimization techniques"""

        print("=" * 60)
        print("Memory Optimization Demo")
        print("=" * 60)

        # Start memory tracking
        tracemalloc.start()
        gc.collect()
        snapshot1 = tracemalloc.take_snapshot()

        # Create cache and populate
        cache = SmartConfigCache(max_size=100)

        # Simulate loading many configurations
        entries = []
        for i in range(1000):
            entry = cache.get_or_create(
                f"config.item.{i}",
                lambda i=i: ConfigEntry(
                    f"config.item.{i}",
                    f"value_{i}",
                    "demo"
                ),
                keep_strong=(i < 100)  # Pin first 100
            )
            entries.append(entry)

        # Take snapshot after creation
        gc.collect()
        snapshot2 = tracemalloc.take_snapshot()

        # Print stats
        print("\nCache Statistics:")
        for key, value in cache.stats().items():
            print(f"  {key}: {value}")

        print("\nMemory Usage:")
        for key, value in cache.memory_usage().items():
            print(f"  {key}: {value:,}")

        # Show memory diff
        diff = snapshot2.compare_to(snapshot1, 'lineno')
        print("\nTop 5 Memory Allocations:")
        for stat in diff[:5]:
            print(f"  {stat}")

        # Demo weak reference cleanup
        print("\n" + "-" * 60)
        print("Weak Reference Cleanup Demo")
        print("-" * 60)

        print(f"Before cleanup - Cache size: {cache.stats()['total_size']}")

        # Delete external references to unpinned entries; the loop
        # variable still points at the last entry, so drop it as well
        del entries, entry
        gc.collect()

        print(f"After cleanup  - Cache size: {cache.stats()['total_size']}")
        print("(Only pinned entries remain)")

        tracemalloc.stop()

    if __name__ == "__main__":
        demo_memory_optimization()

Usage example

    # Assumes the code above is saved as memory_optimized_cache.py
    from memory_optimized_cache import SmartConfigCache, ConfigEntry

    # Initialize cache with size limit
    cache = SmartConfigCache(max_size=1000)

    # Load configuration
    def load_database_config():
        """Factory function to load database config"""
        return ConfigEntry(
            key="database",
            value={
                "host": "localhost",
                "port": 5432,
                "name": "myapp"
            },
            source="config.yaml"
        )

    # Get or create (with strong reference for important config)
    db_config = cache.get_or_create(
        "database",
        load_database_config,
        keep_strong=True  # Keep in memory
    )

    print(f"Database host: {db_config.value['host']}")

    # Temporary config (will be auto-cleaned when not referenced)
    temp_config = cache.get_or_create(
        "temp.setting",
        lambda: ConfigEntry("temp.setting", "temporary", "runtime")
    )

    # Check statistics
    print(cache.stats())
    # {'hits': 0, 'misses': 2, 'hit_rate': '0.0%',
    #  'total_size': 2, 'weak_refs': 1, 'pinned': 1, 'evictions': 0}

    # Memory usage (byte values depend on the Python version and platform)
    print(cache.memory_usage())
    # e.g. {'entry_count': 2, 'estimated_bytes': 112, 'per_entry_bytes': 56}

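Design trade-offs

__slots__ vs. standard classes

Aspect               | Standard class             | __slots__
---------------------------------------------------------------------------------------------------------
Memory footprint     | Higher (has __dict__)      | Lower (roughly 60-70% less)
Dynamic attributes   | Supports obj.new_attr = x  | Not supported (unless '__dict__' is added to __slots__)
Inheritance          | Simple                     | Subclasses must declare their own __slots__, or they get a __dict__ again
Weak references      | Supported by default       | Requires a '__weakref__' slot
Pickle               | Works out of the box       | Works with protocol 2+ (the Python 3 default); protocols 0 and 1 need __getstate__/__setstate__
Multiple inheritance | Works normally             | At most one parent class may have non-empty __slots__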
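The multiple-inheritance row is the one most likely to surprise in practice. The sketch below (class names invented for the example) triggers the layout conflict and shows the usual workaround: mixins that add behavior but no storage declare empty __slots__.

```python
class HasKey:
    __slots__ = ("key",)

class HasValue:
    __slots__ = ("value",)

conflict_error = None
try:
    # Both bases define non-empty __slots__, so their layouts conflict
    class Combined(HasKey, HasValue):
        __slots__ = ()
except TypeError as exc:
    conflict_error = exc
    print(f"TypeError: {exc}")

class ReprMixin:
    """Mixin with empty __slots__: adds behavior, no storage."""
    __slots__ = ()

    def __repr__(self):
        return f"{type(self).__name__}({self.key!r})"

class Entry(ReprMixin, HasKey):
    __slots__ = ("source",)

    def __init__(self, key, source):
        self.key = key
        self.source = source

entry = Entry("db.host", "env")
print(entry)                        # Entry('db.host')
print(hasattr(entry, "__dict__"))   # False
```

Because ReprMixin reserves no space of its own, it can be combined freely with one slotted base without bringing back a per-instance __dict__.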

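Strong-reference vs. weak-reference caches

Aspect                    | Strong-reference cache      | Weak-reference cache
----------------------------------------------------------------------------------
Memory management         | Manual cleanup required     | Automatic cleanup
Data guarantee            | Data is always present      | Data may be collected
Typical use               | Critical configuration      | Transient data
Implementation complexity | Simple                      | Slightly more complex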

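Which technique, and when?

Decision tree:

Are there many objects?
├── Yes → consider __slots__
│   └── Need dynamic attributes?
│       ├── Yes → __slots__ = [..., '__dict__']
│       └── No  → __slots__ = [...]
└── No  → a standard class is fine

Can the cache grow without bound?
├── Yes → use WeakValueDictionary or an LRU cache
└── No  → a plain dict is fine

Can the data be collected and rebuilt?
├── Yes → weakref
└── No  → strong references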
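The LRU branch of the decision tree is not shown elsewhere in this case study, so here is a minimal sketch using functools.lru_cache from the standard library. The load_setting function and its return value are hypothetical stand-ins for a real loader.

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def load_setting(key: str) -> dict:
    """Hypothetical loader; a real one would read a file or a service."""
    return {"key": key, "value": f"value-for-{key}"}

load_setting("a")
load_setting("b")
load_setting("a")      # hit; "a" becomes the most recently used entry
load_setting("c")      # cache is full, so "b" (least recently used) is evicted

info = load_setting.cache_info()
print(info)            # CacheInfo(hits=1, misses=3, maxsize=2, currsize=2)

load_setting.cache_clear()   # comparable to clear_config_cache()
```

An LRU cache holds strong references, so it bounds memory by entry count rather than by reachability, and it works with values such as plain dicts that cannot be weakly referenced.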

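When should you use these techniques?

Good fit

  • Creating many small objects: data points, events, config items
  • Memory is the bottleneck: confirmed by profiling
  • The cache can grow without bound: user sessions, request data
  • Long-running services: web servers, daemons

Not recommended

  • Few objects: the gain is negligible
  • Attributes must be added dynamically: __slots__ limits flexibility
  • Premature optimization: first confirm there really is a problem
  • Readability comes first: standard classes are more straightforward

Optimization decision flow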

    # Step 1: Profile first!
    # Don't optimize until you know where the problem is

    import tracemalloc

    tracemalloc.start()
    # ... run your code ...
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics('lineno')

    for stat in top_stats[:10]:
        print(stat)

    # Step 2: If memory is the issue, identify the class
    # Look for classes with many instances

    import gc
    from collections import Counter

    counter = Counter(type(obj).__name__ for obj in gc.get_objects())
    print(counter.most_common(10))

    # Step 3: Only then apply __slots__ to hot classes

Exercises

Basic exercise: compare memory with and without __slots__

Write a script that compares the memory used by creating 100,000 instances of each of the following three classes:

    # Exercise: Complete this benchmark script

    import sys
    import tracemalloc
    from collections import namedtuple
    from dataclasses import dataclass

    # 1. Standard class
    class PointStandard:
        def __init__(self, x, y, z):
            self.x = x
            self.y = y
            self.z = z

    # 2. Class with __slots__
    class PointSlots:
        __slots__ = ['x', 'y', 'z']
        def __init__(self, x, y, z):
            self.x = x
            self.y = y
            self.z = z

    # 3. Named tuple
    PointNamed = namedtuple('PointNamed', ['x', 'y', 'z'])

    # TODO: Write benchmark function
    def benchmark(cls, count=100000):
        """Measure memory for creating `count` instances"""
        pass  # Implement this

    # TODO: Compare results
    # Expected: PointSlots uses noticeably less memory than PointStandard
    # (around 3x in the benchmarks above; the ratio depends on the Python version)

Advanced exercise: implement a weakref cache

Build an ImageCache class with the following features:

    # Exercise: Implement ImageCache

    class ImageCache:
        """
        Cache for image data with automatic cleanup.

        Requirements:
        - Use WeakValueDictionary for auto-cleanup
        - Track hit/miss statistics
        - Support maximum size limit
        - Provide memory usage estimation

        Hint: cached values must support weak references, so raw bytes
        objects need to be wrapped in a class.

        Example usage:
            cache = ImageCache(max_size=100)

            img = cache.get_or_load("photo.jpg", lambda: load_image("photo.jpg"))

            print(cache.stats())
            # {'hits': 0, 'misses': 1, 'size': 1}
        """

        def __init__(self, max_size=None):
            # TODO: Initialize cache
            pass

        def get_or_load(self, key, loader):
            # TODO: Implement get or load logic
            pass

        def stats(self):
            # TODO: Return cache statistics
            pass

Challenge: track down a memory leak with tracemalloc

The code below leaks memory. Use tracemalloc to find the problem, then fix it:

    # Exercise: Find and fix the memory leak

    class EventHandler:
        _handlers = []  # Class variable - potential leak!

        def __init__(self, name):
            self.name = name
            self.callbacks = []
            EventHandler._handlers.append(self)  # Leak: strong reference

        def register(self, callback):
            self.callbacks.append(callback)

        def fire(self, event):
            for cb in self.callbacks:
                cb(event)

    def process_events():
        """This function creates handlers but never cleans them up"""
        for i in range(1000):
            handler = EventHandler(f"handler_{i}")
            handler.register(lambda e: print(e))
            handler.fire(f"event_{i}")
            # handler goes out of scope but is still in _handlers!

    # TODO:
    # 1. Use tracemalloc to measure memory growth
    # 2. Identify the leak
    # 3. Fix EventHandler to use weak references
    # 4. Verify the fix with tracemalloc

Further reading
