Skip to content

[tools] Validate store metadata in pre-commit hook#12716

Open
biodranik wants to merge 1 commit into
masterfrom
ab/pre-commit-max-length
Open

[tools] Validate store metadata in pre-commit hook#12716
biodranik wants to merge 1 commit into
masterfrom
ab/pre-commit-max-length

Conversation

@biodranik
Copy link
Copy Markdown
Member

Add a files [--fix] FILE... subcommand to check_store_metadata.py that validates only the supplied paths and dispatches each to the appropriate check based on its path. Both fdroid and google flavors are handled the same way, plus iphone/metadata/. With --fix, trailing whitespace is rstripped and written back so the file matches what the store will actually display.

Wire this into the pre-commit hook: staged Play Store and iOS metadata files are passed through files --fix, files mutated by --fix are detected with git diff --quiet and re-staged, and the commit is blocked on validation failures (over-length, emoji, bad URL, etc.).

Also raise the listings/release-notes.txt limit from 499 to 500 to match the documented store limit. The earlier 499 was a workaround for trailing newlines that --fix now removes; this also unblocks sl/release-notes.txt which currently has exactly 500 chars.

Add a `files [--fix] FILE...` subcommand to check_store_metadata.py
that validates only the supplied paths and dispatches each to the
appropriate check based on its path. Both fdroid and google flavors
are handled the same way, plus iphone/metadata/. With --fix, trailing
whitespace is rstripped and written back so the file matches what the
store will actually display.

Wire this into the pre-commit hook: staged Play Store and iOS metadata
files are passed through `files --fix`, files mutated by --fix are
detected with `git diff --quiet` and re-staged, and the commit is
blocked on validation failures (over-length, emoji, bad URL, etc.).

Also raise the listings/release-notes.txt limit from 499 to 500 to
match the documented store limit. The earlier 499 was a workaround
for trailing newlines that --fix now removes; this also unblocks
sl/release-notes.txt which currently has exactly 500 chars.

Signed-off-by: Alexander Borsuk <me@alex.bio>
@biodranik biodranik requested review from strump and vng May 5, 2026 19:02
@biodranik biodranik requested a review from a team as a code owner May 5, 2026 19:02
Copy link
Copy Markdown
Member Author

@biodranik biodranik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an experimental LLM code review


Generated by Claude Code

'release_notes.txt': 4000,
'keywords.txt': 100,
}
IOS_URL_FILES = ('support_url.txt', 'marketing_url.txt', 'privacy_url.txt')
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limits here duplicate the inline constants in check_android() (lines 265–271) and check_ios() (lines 307–312). The existing code comment acknowledges this, but a single future edit to one set without updating the other will silently cause the files pre-commit path and the android/ios CI full-scan path to enforce different limits. Consider having check_android() and check_ios() consume these dicts directly to eliminate the second source of truth.


Generated by Claude Code

if locale not in ANDROID_LOCALES:
return error(path, 'unsupported locale: {}', locale)
if name in ANDROID_LISTING_LIMITS:
return done(path, check_raw(path, ANDROID_LISTING_LIMITS[name], fix=fix)[0])
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_raw() is called directly here without a try/except FileNotFoundError guard. check_text() (used by the full-scan subcommands) handles missing files gracefully. The pre-commit shell hook pre-filters with [ -f "$FILE" ] so the hook path is safe, but running check_store_metadata.py files path/does/not/exist.txt from the CLI will produce an unhandled traceback instead of a error message. The same applies to the check_url() and check_email() call-sites in check_path() below (lines 338 and 354).


Generated by Claude Code

def check_text(path, max, optional=False):
with open(path, 'rb') as f:
raw = f.read()
text = raw.decode('utf-8').rstrip()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if raw data contains invalid utf-8 characters?

Let's wrap in try-except and handle UnicodeDecodeError.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the best way to handle UnicodeDecodeError if not to fail with that exception?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try:
    text = raw.decode('utf-8').rstrip()
except UnicodeDecodeError as e:
    return error(path, "Unicode error: " +str(e)), ""

Here str(e) would contain message:

'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

(ok, email) = check_raw(path, 500)
def check_email(path, fix=False):
(ok, email) = check_raw(path, 500, fix=fix)
ok = ok and email.find('@') != -1 and email.find('.') != -1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ok = ok and email.find('@') != -1 and email.find('.') != -1
ok = ok and '@' in email and '.' in email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants