NNS Python API Status¶
This page summarizes the public NNS Python API surface, known gaps, guarded paths, and design boundaries.
NNS Python is a stable, parity-focused Python port of installed R NNS 13.0, implemented natively in Python on top of NumPy and SciPy. It does not wrap R, call the R package at runtime, or depend on compiled R/C++ shims. The goal is public input/output compatibility where R behavior is stable, documented, and useful. The goal is not to copy every R internal helper name, data-frame quirk, or runtime side effect as a public Python API.
Current release-relevant state: the core partial-moment APIs, deterministic
regression/classification/forecasting surfaces, and scalar/vectorized
multivariate derivative modes are parity-covered on focused fixtures. The
largest remaining API work is now mostly ergonomic: categorical predictor
preparation is explicit through prepare_factor_predictors(...), while direct
raw-factor nns_m_reg(..., factor_2_dummy=True) remains guarded because the
installed R internal path errors. Named R data-frame factor ordering quirks are
documented as outside NNS Python' positional-column API boundary. Performance gaps
remain mostly in large stochastic-dominance workloads where R uses compiled
kernels.
Status labels:
implemented: covered public behavior with no known release-blocking gap.partial: useful public behavior exists, with documented guarded paths or caveats.guarded: intentionally rejected with an explicit error.known gap: public structure may exist, but parity is not yet aligned.
Confidence labels are release-maintainer judgments based on current parity, invariant, and property coverage.
Public API Status¶
| API / group | Status | Confidence | Notes |
|---|---|---|---|
Core partial moments: lpm, upm, lpm_ratio, upm_ratio |
implemented | high | Matches R partial-moment conventions, including degree-zero equality handling. |
Partial-moment matrices and n-dimensional wrappers: pm_matrix, co_lpm_nd, co_upm_nd, dpm_nd |
implemented | high | Public matrix and n-dimensional partial-moment surfaces are covered. |
Pairwise co-moments: co_lpm, co_upm, d_lpm, d_upm |
implemented | high | Python raises on length mismatch instead of silently truncating like R. |
Classical helpers: ecdf_pm, mean_pm, var_pm, skew_pm, kurt_pm, nns_moments |
implemented | high | Population-normalized defaults are documented in docs/conventions.md. |
VaR helpers: lpm_var, upm_var |
implemented | high | Used by deterministic confidence and prediction interval paths. |
Central tendencies: nns_gravity, nns_mode, nns_rescale |
implemented | high | Public helper behavior is covered through direct and dependent tests. |
Dependence and correlation: nns_dep, nns_cor |
implemented | high | Follows installed R bivariate public path; dependence can be below signed correlation magnitude in known R-compatible cases. |
Copula: nns_copula |
implemented | high | Bivariate scalar public form is implemented. |
Causation: nns_causation, causal_matrix |
implemented | medium | Numeric lag paths and tau="ts" behavior are covered; some internal asymmetry granularity can differ in regression dimension reduction. |
Distribution functions: nns_cdf |
implemented | high | Deterministic non-plotting paths are implemented; plotting is ignored. |
Distance helpers: nns_distance, nns_distance_bulk |
implemented | high | Numeric and classification conventions follow installed R behavior. |
Partitioning: nns_part |
implemented | high | Returns plain dictionaries/arrays instead of R data.table objects. |
Regression: nns_reg |
implemented | high | Numeric, class-code, confidence interval, smoothing, dimension-reduction, and public factor-expansion paths are covered. |
Multivariate regression: nns_m_reg |
partial | medium-high | Numeric and class paths are implemented; use prepare_factor_predictors(...) for categorical design matrices before calling nns_m_reg. Direct raw factor expansion remains guarded. |
Stack: nns_stack |
implemented | medium | Numeric/class paths, intervals, factor expansion, and ts_test are covered; exact stochastic sample parity is not expected. |
Boost: nns_boost |
partial | medium | Deterministic and stochastic structures are implemented; one high-feature threshold path remains guarded to match installed-R failure behavior. |
Seasonality: nns_seas |
implemented | high | Non-plotting installed-R path is implemented and cached defensively. |
ARMA and VAR: nns_arma, nns_arma_optim, nns_var |
partial | medium | Numeric forecasting and supported VAR dimension-reduction paths are implemented on focused fixtures. Explicit numeric multi-lag ARMA uses actual-lag weighting instead of installed R's position-based weighting quirk. VAR's multivariate stack stage matches R's effective time-series holdout sizing; the remaining macro-like VAR strict xfail is inherited from ARMA optimizer period selection. Stochastic interval streams are structural/statistical parity only. |
Bootstrap/Monte Carlo: nns_meboot, nns_mc |
implemented | medium | Deterministic diagnostics are parity-tested; exact stochastic replicate parity with R is not expected. |
Stochastic dominance/superiority: fsd, ssd, tsd, .uni wrappers, nns_ss, nns_sd_cluster, sd_efficient_set |
implemented | medium | Public structures and deterministic paths are covered. SD uses exact pure-NumPy prefix-pair kernels plus a degree-1 discrete order-statistic matrix path; R's C++ core remains faster on full finance fixtures. Stochastic intervals use NNS Python RNG. |
ANOVA: nns_anova |
implemented | high | Binary, multi-group, pairwise, and degenerate NaN conventions are covered. |
Normalization: nns_norm |
implemented | high | Numeric matrix path is implemented. |
Categorical helpers: encode_factor_codes, factor_2_dummy, factor_2_dummy_fr, prepare_factor_predictors |
implemented | high | Explicit levels= / factor_levels= should be used to reproduce R factor ordering. prepare_factor_predictors(...) exposes the regression-ready full-rank design matrix path. |
Scalar differentiation: nns_diff, dy_dx |
implemented | high | dy_dx(..., eval_point="overall") and numeric evaluation points are covered. |
Multivariate differentiation: dy_d |
partial | medium-high | Scalar and vectorized point/distribution modes are covered on focused fixtures. Mixed derivatives are supported for two-regressor inputs where defined; multi-row matrix mixed derivatives use pointwise Python semantics rather than R's order-dependent list-matrix packing quirk. |
Guarded And Deferred Paths¶
| Area | Path | Current behavior | Reason / next action |
|---|---|---|---|
| Multivariate regression | direct factor_2_dummy=True raw predictor path |
Guarded with NotImplementedError in direct nns_m_reg(..., factor_2_dummy=True). |
Installed R direct NNS.M.reg raw factor input errors. Use prepare_factor_predictors(...) first, or use the public nns_reg(..., factor_2_dummy=True, factor_levels=...) expansion path. |
| Boost | threshold on the n_features > 10 stochastic path |
Guarded with NotImplementedError on the high-feature stochastic epoch path. |
Installed R errors because test.features is never built. NNS Python keeps this explicit. |
| Boost/factor predictors | named data-frame factor predictor ordering | Deferred, not represented as a named-column API. | NNS Python uses positional X1, X2, ... semantics. Installed R named data frames can reorder columns alphabetically before data.matrix. |
Intentional Design Boundaries¶
- No hidden network fetching happens by default.
- Library code does not auto-load
.envfiles. - External data clients and dataframe libraries are not dependencies.
- NNS Python uses explicit Python errors for some cases where R silently truncates,
coerces, warns, or returns unusable values. Important divergences are recorded
in
docs/conventions.md. - Stochastic exact stream parity is not expected. Stochastic paths use NumPy RNG and are tested structurally/statistically.
- Plotting side effects from R APIs are generally ignored; NNS Python returns data.
- Stochastic-dominance performance work stays pure NumPy for now. The current implementation mirrors R's sorted-column/prefix-sum algorithm and adds Python-specific guard pruning, kept-only active-set scans for degree 2/3 and degree-1 continuous calls, and an exact order-statistic matrix for large degree-1 discrete calls. Optional compiled SD backends remain deferred until benchmark evidence justifies the added packaging and maintenance cost.
Intentional Divergences And Caveats¶
The detailed behavior notes live in docs/conventions.md. Release-relevant
examples include:
- Empty numeric inputs raise
ValueError; R NNS often returnsNaN. - Co-moment length mismatches raise
ValueError; R warns, truncates, and divides by the longer length. - Factor and class labels are explicit. R factor levels become numeric codes;
NNS Python callers should pass
levels=orclass_levels=when ordering matters. - Public outputs use NumPy arrays and plain dictionaries instead of R
data.tableobjects. - Some installed-R quirks are intentionally matched when they affect stable
public output, such as selected interval and
ts_testconventions. - Practical example parity checks live in
tests/parity/test_practical_examples.py. Current passing coverage includes partial-moment equivalences, curve fitting, regression residuals, Boston Housing, and the macro-like VAR multivariate stage. Strict xfails track current installed-R deviations in the Iris classification vignette, the documented ARMA numeric multi-lag weighting divergence, and VAR's ARMA-derived univariate/ensemble outputs. The Iris classification xfail mixes two different issues: NNS Python stack predicts the correct held-out class where installed R NNS 13.0 rounds the same borderline estimate down, while boost remains a true output disparity whose installed-R and NNS Python balanced predictions both miss the held-out class.
Release-Relevant Caveats¶
- The public API is stable and parity-focused. Behavior is not expected to break across minor releases.
- This is not full R parity yet.
dy_dscalar and vectorized point/distribution modes are covered on focused fixtures. Multi-row mixed derivative point matrices intentionally use pointwise Python semantics instead of R's order-dependent packing quirk.- Optional provider support should remain explicit and dependency-light.
- Version changes and release metadata should be handled separately from API status documentation.
Internal Or Out Of Scope¶
Some R NNS helper names are implementation details or lower-level surfaces in
the R package rather than APIs NNS Python should expose one-for-one. Examples include
NNS.ANOVA.bin, Uni.caus, compiled *_cpp shims, sampling helpers, and
generated-vector helpers.
NNS Python implements the corresponding behavior natively in Python where it is
needed by public APIs. It does not mirror every R helper name as a top-level
Python export. Matrix-style public behavior is exposed where supported through
Python names such as causal_matrix; not exposing an exact R helper name does
not mean the implementation delegates to R or compiled code.