wms.config

Configuration module.

Provides dataclasses and helper functions that load, validate and expose configuration for the different parts of the system.

Notably, connection parameters including the security context must be handled for both Postgres and Kafka.

Monitoring specs may be loaded from a dedicated YAML file. There may be thousands of them, so they are marshalled into named tuples to keep memory usage low.

Module Contents

Classes

WebsiteSpec

Website scraping spec.

ScrapingConfig

Scraping service configuration.

PersistenceConfig

Persistence service configuration.

PostgresClientConfig

Postgres client configuration.

AppConfig

Structured app configuration object. Not currently used.

Functions

load_valid_settings(→ dynaconf.Dynaconf)

Find, load and validate settings for given env and config dir.

get_certs_dir(→ Path | None)

Try to extract TLS certs dir from settings.

load_yaml_monitor_specs(→ list[dict[str, Any]])

Validate site scraping config file and load specs.

build_security_context_params(→ dict[str, Any])

Build the security context params.

get_security_context(→ tuple[str, SSLContext | None])

Creates the SSL context if possible, mutating the passed config.

build_kafka_producer(→ aiokafka.producer.AIOKafkaProducer)

Marshal kafka producer config and build instance with transactional semantics.

build_kafka_consumer(→ aiokafka.consumer.AIOKafkaConsumer)

Marshal kafka consumer config and build object with manual commit semantics.

build_postgres_dsn(→ yarl.URL)

Build PostgreSQL connection URL from config.

Attributes

wms.config.MIN_ALLOWED_FREQ_MS :Final[int] = 20
wms.config.logger :logging.Logger
wms.config.KafkaConfig :TypeAlias
wms.config.SSL_CTX_KEYS :tuple[str, ...] = ['cafile', 'capath', 'cadata', 'certfile', 'keyfile', 'password']
wms.config.WMS_ENV_SWITCHER :Final[str] = 'WMS_ENV'
wms.config.load_valid_settings(env: str, config_dir: Path | str) → dynaconf.Dynaconf

Find, load and validate settings for given env and config dir.

wms.config.get_certs_dir(settings: dynaconf.Dynaconf) → Path | None

Try to extract TLS certs dir from settings.

wms.config.load_yaml_monitor_specs(config_dir: pathlib.Path, specs_filename: str) → list[dict[str, Any]]

Validate site scraping config file and load specs.

wms.config.build_security_context_params(config: collections.abc.Mapping[str, Any], *, attrs: collections.abc.Iterable[str] = SSL_CTX_KEYS, certs_dir: Path | None = None) → dict[str, Any]

Build the security context params.

The SSL context needs absolute paths, so try to resolve them. If a path is a bare filename, resolve it relative to the certificates’ dir. Raise if any of the file paths doesn’t point to a file.
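
The resolution rule described above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: the name `resolve_ssl_params`, the `FILE_KEYS` set (which keys are treated as file paths) and the choice of `FileNotFoundError` are assumptions.

```python
from collections.abc import Iterable, Mapping
from pathlib import Path
from typing import Any, Optional

SSL_CTX_KEYS = ("cafile", "capath", "cadata", "certfile", "keyfile", "password")
# Assumption: only these keys are expected to point to an existing file.
FILE_KEYS = frozenset({"cafile", "certfile", "keyfile"})


def resolve_ssl_params(
    config: Mapping[str, Any],
    *,
    attrs: Iterable[str] = SSL_CTX_KEYS,
    certs_dir: Optional[Path] = None,
) -> dict[str, Any]:
    """Pick the SSL-related keys from config and make file paths absolute."""
    params: dict[str, Any] = {}
    for key in attrs:
        value = config.get(key)
        if value is None:
            continue
        if key in FILE_KEYS:
            path = Path(value)
            # A bare filename or relative path is looked up in the certs dir.
            if not path.is_absolute() and certs_dir is not None:
                path = certs_dir / path
            path = path.resolve()
            if not path.is_file():
                raise FileNotFoundError(f"{key}: {path} is not a file")
            value = str(path)
        params[key] = value
    return params
```

Keys outside `attrs` are ignored, so the result can be handed directly to whatever builds the SSL context.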

wms.config.get_security_context(config: collections.abc.Mapping[str, Any], certs_dir: Path | None) → tuple[str, SSLContext | None]

Creates the SSL context if possible, mutating the passed config.
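
One plausible reading of "mutating the passed config" is that the SSL-only keys are removed from the config so the remainder can be passed on as client keyword arguments. The sketch below illustrates that reading with the standard `ssl` module. The function name, the `security_protocol` key and the `PLAINTEXT` default are assumptions, and path resolution against the certs dir is left out.

```python
import ssl
from typing import Any, Optional

SSL_CTX_KEYS = ("cafile", "capath", "cadata", "certfile", "keyfile", "password")


def pop_security_context(config: dict[str, Any]) -> tuple[str, Optional[ssl.SSLContext]]:
    """Return (protocol, SSL context or None) and strip SSL keys from config."""
    # Assumption: the protocol name is stored under "security_protocol".
    protocol = config.pop("security_protocol", "PLAINTEXT")
    # Remove SSL-only keys in place; this is the mutation of the passed config.
    params = {key: config.pop(key) for key in SSL_CTX_KEYS if key in config}
    if "SSL" not in protocol:
        return protocol, None
    context = ssl.create_default_context(
        cafile=params.get("cafile"),
        capath=params.get("capath"),
        cadata=params.get("cadata"),
    )
    if "certfile" in params:
        # Client certificate, with an optional private key file and password.
        context.load_cert_chain(
            params["certfile"], params.get("keyfile"), params.get("password")
        )
    return protocol, context
```

Returning the protocol together with the context lets the caller pass both to a client in one step.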

wms.config.build_kafka_producer(config: KafkaConfig, certs_dir: Path | None, *, transactional: bool = True) → aiokafka.producer.AIOKafkaProducer

Marshal kafka producer config and build instance with transactional semantics.

wms.config.build_kafka_consumer(config: KafkaConfig, certs_dir: Path | None, *topics: str) → aiokafka.consumer.AIOKafkaConsumer

Marshal kafka consumer config and build object with manual commit semantics.
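
As a rough illustration of what the two Kafka builders above might marshal, the sketch below only assembles keyword arguments and does not import aiokafka. `transactional_id` and `enable_auto_commit` are existing aiokafka constructor options; the helper names and the fallback id `wms-producer` are hypothetical.

```python
from collections.abc import Mapping
from typing import Any


def producer_kwargs(
    config: Mapping[str, Any], *, transactional: bool = True
) -> dict[str, Any]:
    """Keyword arguments for a producer, optionally transactional."""
    kwargs = dict(config)
    if transactional:
        # aiokafka enables transactions when a transactional_id is set.
        # Assumption: "wms-producer" is a made-up fallback id.
        kwargs.setdefault("transactional_id", "wms-producer")
    return kwargs


def consumer_kwargs(config: Mapping[str, Any]) -> dict[str, Any]:
    """Keyword arguments for a consumer whose offsets are committed by hand."""
    kwargs = dict(config)
    # Manual commit semantics: the caller commits explicitly after processing.
    kwargs["enable_auto_commit"] = False
    return kwargs
```

The resulting dicts would be passed as `AIOKafkaProducer(**kwargs)` and `AIOKafkaConsumer(*topics, **kwargs)`. With auto-commit off, the consuming service has to call `commit()` itself once a batch has been stored safely.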

class wms.config.WebsiteSpec

Bases: NamedTuple

Website scraping spec.

Named tuples are immutable and use less memory than dataclasses.

url :yarl.URL
frequency :datetime.timedelta
regexp :Pattern | None
classmethod from_dict(config: collections.abc.Mapping[str, Any], default_frequency: datetime.timedelta) → WebsiteSpec

Build a website scraping spec from a raw configuration item.
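
A minimal sketch of such a named tuple and its factory, using only the standard library. The class name `SiteSpec`, the config keys `url`, `frequency_ms` and `regexp`, and the check against `MIN_ALLOWED_FREQ_MS` are assumptions for illustration. The URL is kept as a plain string here, whereas the documented class stores a `yarl.URL`.

```python
import re
from collections.abc import Mapping
from datetime import timedelta
from typing import Any, NamedTuple, Optional

MIN_ALLOWED_FREQ_MS = 20


class SiteSpec(NamedTuple):
    url: str
    frequency: timedelta
    regexp: Optional[re.Pattern]

    @classmethod
    def from_dict(
        cls, config: Mapping[str, Any], default_frequency: timedelta
    ) -> "SiteSpec":
        # Assumption: "url" is required, "frequency_ms" and "regexp" are optional.
        freq_ms = config.get("frequency_ms")
        frequency = (
            default_frequency if freq_ms is None else timedelta(milliseconds=freq_ms)
        )
        # Assumption: the module-level minimum guards against overly tight loops.
        if frequency < timedelta(milliseconds=MIN_ALLOWED_FREQ_MS):
            raise ValueError(f"frequency below {MIN_ALLOWED_FREQ_MS} ms")
        pattern = config.get("regexp")
        # Compile once at load time so every scrape reuses the same pattern.
        return cls(config["url"], frequency, re.compile(pattern) if pattern else None)
```

Compiling the pattern in the factory means an invalid regular expression fails at configuration load time, not in the middle of a scrape.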

class wms.config.ScrapingConfig

Scraping service configuration.

topic :str
default_scraping_frequency :datetime.timedelta
default_scraping_timeout :datetime.timedelta
specs :collections.abc.Sequence[WebsiteSpec]
classmethod from_settings(settings: dynaconf.Dynaconf, config_dir: pathlib.Path) → ScrapingConfig

Build scraping service configuration from raw configuration dict.

class wms.config.PersistenceConfig

Persistence service configuration.

Defaults are geared toward insertion efficiency, not low latency.

topic :str
max_records :int = 1000
timeout_ms :int = 20
classmethod from_settings(settings: collections.abc.Mapping[str, Any]) → PersistenceConfig

Build persistence service configuration from raw configuration dict.
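
The defaults above can be modelled with a small dataclass. This is a sketch under assumptions: the class name `BatchConfig` and the settings keys are hypothetical, chosen to mirror the documented attributes.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any


@dataclass
class BatchConfig:
    topic: str
    max_records: int = 1000  # large batches favour bulk inserts
    timeout_ms: int = 20

    @classmethod
    def from_settings(cls, settings: Mapping[str, Any]) -> "BatchConfig":
        # Pass only the keys that are present so the defaults cover the rest.
        optional = {
            key: int(settings[key])
            for key in ("max_records", "timeout_ms")
            if key in settings
        }
        return cls(topic=settings["topic"], **optional)
```

The two numeric fields have the same names as the `max_records` and `timeout_ms` arguments of aiokafka's `AIOKafkaConsumer.getmany()`, which would fit a batch-fetching persistence loop, although the source does not state how they are consumed.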

wms.config.build_postgres_dsn(settings: collections.abc.Mapping[str, Any]) → yarl.URL

Build PostgreSQL connection URL from config.
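
A sketch of how a connection URL could be assembled from a settings mapping. It uses `urllib.parse` and returns a plain string, whereas the documented function returns a `yarl.URL`. The function name, the settings keys and the defaults for host and port are assumptions.

```python
from collections.abc import Mapping
from typing import Any
from urllib.parse import quote


def postgres_dsn(settings: Mapping[str, Any]) -> str:
    """Assemble a postgresql:// URL, percent-encoding the credentials."""
    # Assumption: hypothetical keys "user", "password", "host", "port", "database".
    user = quote(str(settings["user"]), safe="")
    password = quote(str(settings["password"]), safe="")
    host = settings.get("host", "localhost")
    port = int(settings.get("port", 5432))
    database = quote(str(settings["database"]), safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Percent-encoding matters because passwords often contain characters such as `@` or `/`, which would otherwise change how the URL is parsed.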

class wms.config.PostgresClientConfig

Postgres client configuration.

dsn :yarl.URL
ssl :SSLContext | SSLMode | bool
classmethod from_settings(settings: collections.abc.Mapping[str, Any], certs_dir: Path | None = None) → PostgresClientConfig

Build the PostgreSQL connection URL from config and create a new client configuration.

class wms.config.AppConfig(settings: dynaconf.Dynaconf)

Structured app configuration object. Not currently used.

property config_dir :pathlib.Path

Configuration root dir.

property certs_dir :pathlib.Path

Certificate Authority root dir.