"Leveraging OCR and neural networks to solve Sudoku puzzles from images."
ROGER Robin
The OCR Sudoku Solver is a tool capable of solving any Sudoku puzzle by recognizing and processing an image of the grid. This project allowed me to explore the functionality of neural networks and gain hands-on experience with image processing techniques, including the Hough Transform algorithm.
The main goal of the project was to develop a system that recognizes and solves Sudoku puzzles from images.
Core Functionalities:
Step-by-Step Process:
Image preprocessing is the first step in the Sudoku-solving pipeline. It improves the input image by reducing noise, neutralizing dominant colors, and correcting rotation, so that grid detection and digit recognition work on a clean, upright image.
Enhance the contrast by identifying dominant colors and converting them to white, followed by grayscale transformation. This step improves detail visibility and prepares the image for further processing.
Figure 1: Before and After Preprocessing
Algorithms Considered:
Gray = 0.22 * R + 0.65 * G + 0.11 * B
Gray = 0.299 * R + 0.587 * G + 0.114 * B
Figure 2: Grayscale Conversion
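The second formula above uses the familiar ITU-R BT.601 luma weights. As a minimal sketch of how it could be applied, the function below converts an interleaved RGB buffer to gray levels. The buffer layout (3 bytes per pixel) and the function names are assumptions made for illustration, not the project's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one RGB pixel to an 8-bit gray level using
   Gray = 0.299 * R + 0.587 * G + 0.114 * B. */
uint8_t rgb_to_gray(uint8_t r, uint8_t g, uint8_t b)
{
    double gray = 0.299 * r + 0.587 * g + 0.114 * b;
    return (uint8_t)(gray + 0.5); /* round to nearest */
}

/* Convert an interleaved RGB buffer (3 bytes per pixel, an assumed layout)
   into a gray buffer of pixel_count bytes. */
void grayscale_image(const uint8_t *rgb, uint8_t *gray, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; i++) {
        gray[i] = rgb_to_gray(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
    }
}
```

Because the three weights sum to 1, a pure white pixel stays at 255 and the output never leaves the 8-bit range.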
Converts the image to black and white by analyzing grayscale intensity and dynamically calculating thresholds based on pixel distribution. Adaptive thresholds ensure robustness against varied lighting conditions.
Figure 3: Adaptive Binarization
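The text does not name the thresholding method. One common way to derive a threshold from the pixel distribution is Otsu's method, which picks the gray level that best separates the histogram into two classes. The sketch below assumes that method and computes a single global threshold; a locally adaptive variant would run the same computation per window. All function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pick a threshold from the gray-level histogram with Otsu's method:
   the level that maximizes the between-class variance. */
int otsu_threshold(const uint8_t *gray, size_t n)
{
    size_t hist[256] = {0};
    for (size_t i = 0; i < n; i++) hist[gray[i]]++;

    double total_sum = 0.0;
    for (int t = 0; t < 256; t++) total_sum += (double)t * (double)hist[t];

    double sum_bg = 0.0, best_var = -1.0;
    size_t w_bg = 0;
    int best_t = 0;
    for (int t = 0; t < 256; t++) {
        w_bg += hist[t];                 /* pixels at or below level t */
        if (w_bg == 0) continue;
        size_t w_fg = n - w_bg;          /* pixels above level t */
        if (w_fg == 0) break;
        sum_bg += (double)t * (double)hist[t];
        double mean_bg = sum_bg / (double)w_bg;
        double mean_fg = (total_sum - sum_bg) / (double)w_fg;
        double diff = mean_bg - mean_fg;
        double var = (double)w_bg * (double)w_fg * diff * diff;
        if (var > best_var) { best_var = var; best_t = t; }
    }
    return best_t;
}

/* Pixels above the threshold become white (255), the rest black (0). */
void binarize(const uint8_t *gray, uint8_t *out, size_t n, int threshold)
{
    for (size_t i = 0; i < n; i++) out[i] = (gray[i] > threshold) ? 255 : 0;
}
```

A typical call sequence would be `binarize(gray, out, n, otsu_threshold(gray, n))`.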
Detects and corrects skewed grids for accurate grid alignment:
Figure 4: Rotation Correction
Overview:
Grid detection involves identifying and extracting the Sudoku grid from an image using edge detection, histograms, and segmentation techniques. This ensures precise localization of the grid for further processing.
Key Steps:
Figure 1: Gradient Edge Detection
G(x, y) = √(Gh(x, y)² + Gv(x, y)²). The result was thresholded to create a binary image highlighting the grid contours.
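To make the gradient step concrete, here is a minimal sketch that computes Gh and Gv with 3x3 Sobel kernels and thresholds the magnitude. The choice of Sobel kernels, the integer threshold, and the function name are assumptions; the text only gives the magnitude formula. The code compares squared values, which gives the same result as thresholding G(x, y) for a non-negative threshold and avoids the square root.

```c
#include <stdint.h>

/* Build a binary edge map from a grayscale image.
   Gh and Gv come from 3x3 Sobel kernels (an assumed kernel choice).
   A pixel is an edge when Gh^2 + Gv^2 > threshold^2, which is the same
   test as G(x, y) > threshold. Border pixels are left at 0. */
void gradient_edges(const uint8_t *gray, uint8_t *edges,
                    int width, int height, int threshold)
{
    static const int kh[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int kv[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    for (int i = 0; i < width * height; i++) edges[i] = 0;

    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            long gh = 0, gv = 0;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int p = gray[(y + dy) * width + (x + dx)];
                    gh += kh[dy + 1][dx + 1] * p;
                    gv += kv[dy + 1][dx + 1] * p;
                }
            }
            long mag2 = gh * gh + gv * gv;
            edges[y * width + x] =
                (mag2 > (long)threshold * threshold) ? 255 : 0;
        }
    }
}
```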
Figure 2: Combined Gradient and Contour Detection
Figure 3: Histogram-Based Grid Localization
Figure 4: Cropped Grid and Individual Cells
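The details of the histogram step are not given here, so the following is only one plausible reading: count foreground pixels per row and per column of the binary image, take the range where the counts are high as the grid extent, then split that extent into nine equal cells. The function names and the `min_count` cutoff are hypothetical.

```c
#include <stdint.h>

/* Count foreground (non-zero) pixels in each column and each row of a
   binary image. col_hist has `width` entries, row_hist has `height`. */
void projection_histograms(const uint8_t *bin, int width, int height,
                           int *col_hist, int *row_hist)
{
    for (int x = 0; x < width; x++) col_hist[x] = 0;
    for (int y = 0; y < height; y++) row_hist[y] = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (bin[y * width + x]) {
                col_hist[x]++;
                row_hist[y]++;
            }
        }
    }
}

/* Find the first and last index whose count reaches min_count.
   Returns 1 on success, 0 if no entry qualifies. */
int histogram_extent(const int *hist, int n, int min_count,
                     int *first, int *last)
{
    *first = -1;
    *last = -1;
    for (int i = 0; i < n; i++) {
        if (hist[i] >= min_count) {
            if (*first < 0) *first = i;
            *last = i;
        }
    }
    return *first >= 0;
}

/* Bounds of cell `index` (0..8) along one axis when the inclusive span
   [first, last] is split into nine equal parts. */
void cell_bounds(int first, int last, int index, int *start, int *end)
{
    int span = last - first + 1;
    *start = first + (span * index) / 9;
    *end = first + (span * (index + 1)) / 9 - 1;
}
```

Running `histogram_extent` on both histograms gives the bounding box to crop, and calling `cell_bounds` once per axis gives the rectangle of each of the 81 cells.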
Overview:
Neural networks handle digit recognition in this project. If the network misreads a digit, the solver receives a wrong grid and cannot produce a valid solution, so a properly configured network is essential.
Key Steps:
Figure 1: Matrix and Neural Network Structures
Figure 2: XOR Neural Network Architecture
Figure 3: Dataset Format and Image Structure
Figure 4: Recognition Results on Sudoku Test Images
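As an illustration of the XOR architecture mentioned above, the sketch below runs a forward pass through a 2-2-1 network stored as flat weight matrices. It is not the project's code: the `Layer` struct is hypothetical, the weights are set by hand rather than learned, and a hard step activation stands in for the sigmoid a trained network would normally use, which keeps the example free of math-library dependencies.

```c
#include <stddef.h>

/* A dense layer: `outputs` x `inputs` weights stored row by row,
   plus one bias per output (a hypothetical layout). */
typedef struct {
    size_t inputs, outputs;
    const double *weights;
    const double *biases;
} Layer;

/* Hard-threshold activation, used here instead of a sigmoid. */
double step(double x) { return x > 0.0 ? 1.0 : 0.0; }

/* out = step(W * in + b) */
void layer_forward(const Layer *layer, const double *in, double *out)
{
    for (size_t o = 0; o < layer->outputs; o++) {
        double sum = layer->biases[o];
        for (size_t i = 0; i < layer->inputs; i++)
            sum += layer->weights[o * layer->inputs + i] * in[i];
        out[o] = step(sum);
    }
}

/* 2-2-1 network with hand-set weights: hidden unit 0 acts as OR,
   hidden unit 1 as AND, and the output fires for "OR and not AND". */
int xor_predict(int a, int b)
{
    static const double w_hidden[] = {1.0, 1.0,   /* OR  */
                                      1.0, 1.0};  /* AND */
    static const double b_hidden[] = {-0.5, -1.5};
    static const double w_out[] = {1.0, -2.0};
    static const double b_out[] = {-0.5};

    const Layer hidden = {2, 2, w_hidden, b_hidden};
    const Layer output = {2, 1, w_out, b_out};

    double in[2] = {(double)a, (double)b};
    double h[2], y[1];
    layer_forward(&hidden, in, h);
    layer_forward(&output, h, y);
    return (int)y[0];
}
```

The hidden layer is what makes XOR possible: a single layer can only draw one separating line, and XOR needs two.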
Overview:
The result display integrates image processing, neural network-based digit recognition, and user interaction to overlay the solution onto the original Sudoku image. Showing each stage and letting the user correct misrecognized digits keeps the process transparent and under the user's control.
Key Steps:
Figure 1: Home and Download Pages
Figure 2: Application Options and Step-by-Step View
Figure 3: File Browser for Image Selection
Figure 4: Digit Recognition and Correction
Figure 5: Final Solved Sudoku
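The final image presupposes a solved grid. The text does not say which solving algorithm the project uses, so the sketch below assumes plain backtracking, a common choice for 9x9 puzzles. Empty cells are represented by 0, and the function names are hypothetical.

```c
#include <stdbool.h>

/* True if `digit` can go at (row, col) without repeating in its row,
   column, or 3x3 box. */
bool can_place(int grid[9][9], int row, int col, int digit)
{
    for (int i = 0; i < 9; i++) {
        if (grid[row][i] == digit || grid[i][col] == digit) return false;
    }
    int br = row - row % 3, bc = col - col % 3;
    for (int r = br; r < br + 3; r++)
        for (int c = bc; c < bc + 3; c++)
            if (grid[r][c] == digit) return false;
    return true;
}

/* Fill the empty cells (0) in place by backtracking.
   Returns true when a complete solution was found. */
bool solve_sudoku(int grid[9][9])
{
    for (int row = 0; row < 9; row++) {
        for (int col = 0; col < 9; col++) {
            if (grid[row][col] != 0) continue;
            for (int d = 1; d <= 9; d++) {
                if (can_place(grid, row, col, d)) {
                    grid[row][col] = d;
                    if (solve_sudoku(grid)) return true;
                    grid[row][col] = 0; /* undo and try the next digit */
                }
            }
            return false; /* no digit fits this cell */
        }
    }
    return true; /* no empty cell left */
}
```

A `false` return is useful in this pipeline: it signals that the recognized grid is inconsistent, which usually means a digit was misread and needs correction.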
Tools and Technologies Used:
Highlight Innovation:
A unique approach in this project was the use of the Hough Transform for grid detection, which enabled precise extraction of the Sudoku grid from images. Additionally, implementing a neural network in C required leveraging efficient data structures and optimizing matrix operations for high performance.
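For readers unfamiliar with the Hough Transform, the core idea is a voting scheme: every edge pixel votes for all lines rho = x * cos(theta) + y * sin(theta) that pass through it, and peaks in the accumulator correspond to the grid lines. The sketch below is a generic illustration under stated assumptions, not the project's implementation. The `HoughAngle` table is hypothetical and is expected to be filled by the caller, for example with `cos` and `sin` from `<math.h>` for each sampled angle.

```c
#include <stdint.h>
#include <string.h>

/* One sampled line direction: cos(theta) and sin(theta),
   precomputed by the caller (a hypothetical helper type). */
typedef struct { double cos_t, sin_t; } HoughAngle;

/* Vote in (angle, rho) space for every edge pixel.
   acc must hold n_angles * (2 * max_rho + 1) ints; a distance rho in
   [-max_rho, max_rho] is stored at column rho + max_rho. */
void hough_vote(const uint8_t *edges, int width, int height,
                const HoughAngle *angles, int n_angles,
                int max_rho, int *acc)
{
    int n_rho = 2 * max_rho + 1;
    memset(acc, 0, sizeof(int) * (size_t)n_angles * (size_t)n_rho);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (!edges[y * width + x]) continue;
            for (int a = 0; a < n_angles; a++) {
                double rho = x * angles[a].cos_t + y * angles[a].sin_t;
                int r = (int)(rho + (rho >= 0.0 ? 0.5 : -0.5)); /* round */
                if (r >= -max_rho && r <= max_rho)
                    acc[a * n_rho + r + max_rho]++;
            }
        }
    }
}

/* Index of the accumulator cell with the most votes. */
int hough_peak(const int *acc, int n_cells)
{
    int best = 0;
    for (int i = 1; i < n_cells; i++)
        if (acc[i] > acc[best]) best = i;
    return best;
}
```

Dividing the peak index by `2 * max_rho + 1` recovers the angle index, and the remainder minus `max_rho` recovers rho. In practice `max_rho` is set to at least the image diagonal so that no vote falls outside the accumulator.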
Team Members:
My Role:
My primary contribution was developing the grid detection module. By utilizing the Hough Transform algorithm, I ensured precise identification and extraction of the Sudoku grid, which was crucial for subsequent steps in the project.
You can download the executable for this project by clicking the button below: