feat: improve SPA scraping and increase test coverage
- Add SPA support for Playwright with wait_for_network_idle and extra_wait_ms
- Add BaseStore.get_spa_config() and requires_playwright() methods
- Implement AliExpress SPA config with JSON price extraction patterns
- Fix Amazon price parsing to prioritize whole+fraction combination
- Fix AliExpress regex patterns (remove double backslashes)
- Add CLI tests: detect, doctor, fetch, parse, run commands
- Add API tests: auth, logs, products, scraping_logs, webhooks

Tests: 417 passed, 85% coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -152,5 +152,32 @@ class BaseStore(ABC):
         """
         return self.selectors.get(key, default)
 
+    def get_spa_config(self) -> Optional[dict]:
+        """
+        Return the SPA configuration for Playwright if this store is a SPA.
+
+        Returns:
+            dict of Playwright options, or None if the store is not a SPA:
+            - wait_for_selector: CSS selector to wait for before scraping
+            - wait_for_network_idle: Wait until the network is idle
+            - extra_wait_ms: Additional delay after the page has loaded
+
+        Returns None by default (no specific SPA config).
+        SPA stores must override this method.
+        """
+        return None
+
+    def requires_playwright(self) -> bool:
+        """
+        Indicate whether this store strictly requires Playwright.
+
+        Returns:
+            True if Playwright is required, False otherwise
+
+        False by default. Stores with aggressive anti-bot protection or
+        mandatory SPA rendering must override this method.
+        """
+        return False
+
     def __repr__(self) -> str:
         return f"<{self.__class__.__name__} id={self.store_id}>"
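A minimal sketch of how a store might override the two new hooks and how a caller might consume them. `ExampleSpaStore`, `plan_fetch`, the `.product-price` selector, and the 2000 ms delay are hypothetical and do not come from this commit. The `BaseStore` here is a trimmed stand-in that reproduces only the defaults shown in the hunk above. Treating a non-None SPA config as a reason to use Playwright is an assumption, not something the diff states.

```python
from abc import ABC
from typing import Optional


class BaseStore(ABC):
    """Trimmed stand-in for the real BaseStore; only the two new hooks are kept."""

    store_id = "base"

    def get_spa_config(self) -> Optional[dict]:
        # Default from the diff: no SPA-specific configuration.
        return None

    def requires_playwright(self) -> bool:
        # Default from the diff: Playwright is not mandatory.
        return False


class ExampleSpaStore(BaseStore):
    """Hypothetical SPA store; selector and delay values are illustrative only."""

    store_id = "example_spa"

    def get_spa_config(self) -> Optional[dict]:
        # Keys match the ones documented in the get_spa_config() docstring.
        return {
            "wait_for_selector": ".product-price",
            "wait_for_network_idle": True,
            "extra_wait_ms": 2000,
        }

    def requires_playwright(self) -> bool:
        return True


def plan_fetch(store: BaseStore) -> dict:
    """Hypothetical helper: pick a fetch strategy from the store's hooks."""
    spa_config = store.get_spa_config()
    # Assumption: a store that declares an SPA config should also use Playwright.
    use_playwright = store.requires_playwright() or spa_config is not None
    return {"use_playwright": use_playwright, "options": spa_config or {}}


if __name__ == "__main__":
    print(plan_fetch(BaseStore()))
    print(plan_fetch(ExampleSpaStore()))
```

Keeping both hooks on the base class with inert defaults means existing stores need no changes; only stores that actually render client-side have to override them.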