CHORE: Add LCOV_EXCL_LINE markers to LOG() statements for coverage analysis #556
gargsaumya wants to merge 4 commits into
- Added LCOV_EXCL_LINE markers to 284 LOG() diagnostic statements in C++ code
- Excludes debug/diagnostic logging from code coverage metrics
- Affected files: connection.cpp, connection_pool.cpp, ddbc_bindings.cpp
- Includes automation script (add_lcov_exclusions.py) for future maintenance
- Expected coverage improvement: ~3-5% (excluding non-functional diagnostic lines)
📊 Code Coverage Report
Diff Coverage
Diff: main...HEAD, staged and unstaged changes
📋 Files Needing Attention
📉 Files with overall lowest coverage
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 67.9%
mssql_python.row.py: 70.5%
mssql_python.pybind.logger_bridge.hpp: 70.8%
mssql_python.pybind.connection.connection.cpp: 77.2%
mssql_python.__init__.py: 77.3%
mssql_python.pybind.ddbc_bindings.cpp: 78.2%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.connection.py: 85.3%
mssql_python.logging.py: 85.5%
Critical fixes to achieve 85%+ coverage:
1. Extended LCOV_EXCL_LINE markers to all continuation lines (284->593 markers)
2. Updated generate_codecov.sh to filter marked lines from coverage
3. Updated .coveragerc and .gitignore for utility scripts and artifacts

Expected impact: 79% -> 85-87% coverage
- Properly recalculate LH (lines hit) and LF (lines found) after filtering
- Ensures lcov --summary and genhtml show correct line counts
- Coverage % already correct at 81%, this fixes internal consistency
- All coverage tools will now show matching statistics
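The recalculation described in this commit can be sketched as follows. This is a minimal illustration, not the code from `generate_codecov.sh`: the function name `recompute_totals` and the `excluded` set are hypothetical, and it assumes only the standard lcov tracefile records `SF:`, `DA:`, `LF:`, `LH:` and `end_of_record`.

```python
def recompute_totals(tracefile_text, excluded):
    """Drop DA records for excluded (file, line) pairs and rewrite LF/LH.

    `excluded` is a hypothetical set of (source_path, line_number) tuples.
    """
    out = []
    current = None      # path from the most recent SF: record
    found = hit = 0     # running LF / LH for the current record
    for line in tracefile_text.splitlines():
        if line.startswith("SF:"):
            current, found, hit = line[3:], 0, 0
            out.append(line)
        elif line.startswith("DA:"):
            # DA:<line>,<count>[,<checksum>]
            fields = line[3:].split(",")
            lineno, count = int(fields[0]), int(fields[1])
            if (current, lineno) in excluded:
                continue  # filtered line: counts toward neither LF nor LH
            found += 1
            if count > 0:
                hit += 1
            out.append(line)
        elif line.startswith(("LF:", "LH:")):
            continue  # stale totals, re-emitted at end_of_record
        elif line == "end_of_record":
            out.append(f"LF:{found}")
            out.append(f"LH:{hit}")
            out.append(line)
        else:
            out.append(line)  # TN:, FN:, BRDA:, etc. pass through untouched
    return "\n".join(out) + "\n"
```

Without the rewrite of `LF:` and `LH:`, tools that trust the stored totals would disagree with tools that recount the `DA:` records, which is the inconsistency this commit message describes.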
Add test_021_coverage_edge_cases.py with 10 validated tests targeting uncovered C++ code paths:
- DECIMAL(38,18) with maximum precision
- TIME(7) and TIME(0) for SQL_SS_TIME2 paths
- DATETIMEOFFSET with extreme timezones (±14:00) and year boundaries
- DATE standalone type boundaries
- UNIQUEIDENTIFIER edge cases (all-zeros, all-Fs, nil UUID)
- VARBINARY with empty, small, and 16KB (LOB path) values
- BINARY(10) fixed-length padding behavior
- Combined multi-type result sets

These tests target ddbc_bindings.cpp gaps identified in coverage analysis. All tests pass locally and are based on proven patterns from existing tests. Target: Increase coverage from 81% to 85%+
bewithgaurav left a comment
A revamped take on the LOG-coverage gap
I called this exact gap out in the logging framework PR a few months ago: #312 (comment). at the time I rejected the LCOV markers approach for the same reasons that show up here: 150+ manual changes, clutters the codebase, hides executable code paths. coming back to it now I think there's a third option we didn't consider then that gets the same metric lift without any of those costs.
the 600 LCOV_EXCL_LINE markers and the python filter in generate_codecov.sh can be replaced with one tiny pre-build step plus one extra lcov flag. same coverage number, cleaner source tree, no helper scripts.
ran it end-to-end on macos arm64. full pytest suite including the logging integration tests:
PR's machinery: 3916/5079 = 77.10%
proposed flow: 3912/5052 = 77.43%
1733/1733 tests pass. logging still works at runtime, LoggerBridge gets normal coverage, nothing about behavior changes.
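A quick check of the arithmetic behind the two figures above, using only the hit/found line counts quoted in this comment (the helper name `line_coverage` is illustrative):

```python
def line_coverage(hit, found):
    """Line coverage as a percentage: lines hit over lines found."""
    return 100.0 * hit / found

pr_machinery = line_coverage(3916, 5079)   # PR's markers + python filter
proposed = line_coverage(3912, 5052)       # join step + --omit-lines

print(f"PR's machinery: {pr_machinery:.2f}%")
print(f"proposed flow:  {proposed:.2f}%")
print(f"difference:     {proposed - pr_machinery:+.2f}pp")
```

The proposed flow counts 27 fewer lines found (5079 vs 5052) and 4 fewer lines hit (3916 vs 3912), so the denominator shrinks faster than the numerator and the percentage rises.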
the proposal
- add a small script (`scripts/join_logs_for_coverage.py`, ~50 lines) that joins multi-line LOG calls onto single lines. only used in the codecov build, never committed source changes.
- in `mssql_python/pybind/build.sh` codecov mode, snapshot the source, run the join, build, restore on EXIT via `trap` (~12 lines).
- in `generate_codecov.sh`, swap the embedded python filter for `--omit-lines '\bLOG[A-Z_]*\('` on the existing `lcov -a` line.
joining preserves runtime behavior because adjacent string literals concatenate at compile time. the .so built from joined source is bit-for-bit equivalent to one built from the original.
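A rough sketch of what the join step could look like. This is an assumption about the approach, not the contents of `scripts/join_logs_for_coverage.py`: the names `paren_delta` and `join_log_calls` are hypothetical, and the parenthesis counter only understands double-quoted string literals (it ignores comments, character literals and raw strings).

```python
import re

# Same pattern as the proposed --omit-lines argument.
LOG_CALL = re.compile(r"\bLOG[A-Z_]*\(")

def paren_delta(text):
    """Net count of '(' minus ')' outside double-quoted string literals."""
    depth, in_string, escaped = 0, False, False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
    return depth

def join_log_calls(source):
    """Fold each multi-line LOG*( call onto the line where it starts."""
    out, pending, depth = [], None, 0
    for line in source.splitlines():
        if pending is None:
            delta = paren_delta(line)
            if LOG_CALL.search(line) and delta > 0:
                pending, depth = line.rstrip(), delta  # call is still open
            else:
                out.append(line)
        else:
            pending += " " + line.strip()
            depth += paren_delta(line)
            if depth <= 0:  # closing paren reached
                out.append(pending)
                pending = None
    if pending is not None:  # unterminated call: emit what was collected
        out.append(pending)
    return "\n".join(out) + "\n"
```

Once every LOG call sits on one line, a single line-based pattern is enough to exclude the whole statement, which is why the continuation-line markers are no longer needed.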
what gets deleted
- all 600 `// LCOV_EXCL_LINE` markers from the cpp files
- `add_lcov_exclusions.py` and `fix_multiline_log_exclusions.py` at repo root
- the embedded ~90-line `PYTHON_FILTER` block in `generate_codecov.sh`
tests/test_021_coverage_edge_cases.py and .gitignore additions stay.
small bonus
the helper scripts use 'LOG(' in line which misses LOG_ERROR( and LOG_WARNING(. 11 sites currently unmarked (9 LOG_ERROR + 1 LOG_WARNING + their continuations). the proposed regex catches them. that's the +0.33pp.
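The difference between the two matchers is easy to reproduce. The sample lines below are made up for illustration (`BACKLOG` in particular is a hypothetical macro, not something from this codebase); the regex is the one from the proposed `--omit-lines` flag, evaluated here with Python's `re` rather than lcov.

```python
import re

LOG_CALL = re.compile(r"\bLOG[A-Z_]*\(")

samples = [
    'LOG("connected");',
    'LOG_ERROR("failed: %s", msg);',
    'LOG_WARNING("slow query");',
    'BACKLOG(queue);',          # hypothetical macro that merely ends in LOG
    'return catalog(name);',
]

# Substring test used by the helper scripts.
by_substring = [s for s in samples if "LOG(" in s]
# Word-boundary regex from the proposal.
by_regex = [s for s in samples if LOG_CALL.search(s)]

print("substring:", by_substring)
print("regex:    ", by_regex)
```

The substring test skips `LOG_ERROR(` and `LOG_WARNING(` because neither contains the literal `LOG(`, and it would also match any identifier that happens to end in `LOG`. The `\b` anchor and the `[A-Z_]*` suffix handle both cases.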
Work Item / Issue Reference
Summary
This pull request introduces a script to automate the exclusion of diagnostic logging statements from code coverage analysis and applies these exclusions throughout the C++ connection codebase. The primary change is the addition of the `// LCOV_EXCL_LINE` marker to all `LOG()` statements, ensuring that these diagnostic lines are not counted toward code coverage metrics. This improves the accuracy of coverage reports by focusing on executable business logic rather than logging.

Key changes include:

Automation Script:
- `add_lcov_exclusions.py` scans C++ source files and appends the `// LCOV_EXCL_LINE` marker to any `LOG()` statement that does not already have it. This script is designed to be run on all relevant files to automate the process and ensure consistency.

Code Coverage Exclusions in C++ Source:
- Updated all `LOG()` statements in `mssql_python/pybind/connection/connection.cpp` to include the `// LCOV_EXCL_LINE` marker. This affects logging throughout connection allocation, connection/disconnection, transaction management, attribute setting, error handling, and more.
- Applied the same exclusion marker to `LOG()` statements in connection pool management code (`mssql_python/pybind/connection/connection_pool.cpp`), ensuring that pool-related logging is also excluded from coverage.

These changes standardize the treatment of logging across the codebase and provide a maintainable, automated approach for future code coverage analysis.