Ace Your Interviews 🎯

Browse our collection of interview questions across various technologies.

Python · Beginner

What is Python and why is it so popular in 2026?

Answer

Python is a high-level, interpreted, dynamically typed, general-purpose programming language. It's popular because its readable, English-like syntax lowers the learning curve; one language covers web development (FastAPI/Django), data science (pandas/NumPy), machine learning (PyTorch/scikit-learn), and automation; PyPI hosts a massive ecosystem of 500,000+ packages; and it is the dominant language of AI/ML tooling.

Python · Beginner

What is the difference between a list and a tuple in Python?

Answer

Lists are mutable (can be changed after creation) and use square brackets: [1, 2, 3]. Tuples are immutable (cannot be changed) and use parentheses: (1, 2, 3). Tuples are hashable and can be used as dictionary keys or set elements. Lists have more methods (append, remove, sort). Use tuples for data that shouldn't change (coordinates, RGB colors, database records).
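
A minimal runnable sketch of the differences (variable names are illustrative):

```python
# Lists are mutable: contents can change in place.
point_list = [1, 2, 3]
point_list.append(4)            # now [1, 2, 3, 4]

# Tuples are immutable: mutation attempts raise TypeError.
point_tuple = (1, 2, 3)
try:
    point_tuple[0] = 99
except TypeError:
    mutation_failed = True

# Tuples are hashable, so they can serve as dict keys.
distances = {(0, 0): 0.0, (3, 4): 5.0}
```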

Python · Beginner

What is a Python decorator and how does it work?

Answer

A decorator is a function that wraps another function to add behavior without modifying it. It takes a function as argument, defines a wrapper function that adds behavior before/after calling the original, and returns the wrapper. Used via @decorator_name syntax above a function definition. Powers @login_required in Django, @app.get() in FastAPI, and @pytest.mark.parametrize in pytest.
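
A small sketch of the pattern (the decorator name and what it records are illustrative):

```python
import functools

def log_calls(func):
    """Decorator: records the arguments of each call to the wrapped function."""
    calls = []

    @functools.wraps(func)          # preserves func's name and docstring
    def wrapper(*args, **kwargs):
        calls.append(args)          # behavior added around the original call
        return func(*args, **kwargs)

    wrapper.calls = calls
    return wrapper

@log_calls                          # equivalent to: add = log_calls(add)
def add(a, b):
    return a + b

result = add(2, 3)
```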

Python · Beginner

Explain list comprehension vs a for loop with append.

Answer

List comprehension: [x * 2 for x in range(10) if x % 2 == 0] — faster (C-optimized), more Pythonic, concise. For loop: doubled = []; for x in range(10): if x % 2 == 0: doubled.append(x * 2) — more readable for complex logic, necessary when each iteration needs multiple statements. Use comprehensions for simple transformations and filters.

Python · Beginner

What is the difference between deep copy and shallow copy?

Answer

Shallow copy (copy.copy() or list[:]) creates a new object but references the same nested objects. Modifying a nested list in the copy affects the original. Deep copy (copy.deepcopy()) creates a completely independent copy — all nested objects are also copied. Use deep copy when you need to modify nested structures independently.
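
A short demonstration of the shared-reference behavior described above:

```python
import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)       # new outer list, same inner lists
deep = copy.deepcopy(original)      # fully independent clone

shallow[0].append(99)               # mutates the inner list shared with original
```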

Python · Beginner

What are *args and **kwargs in Python?

Answer

*args captures any number of positional arguments as a tuple. **kwargs captures any number of keyword arguments as a dictionary. def func(*args, **kwargs) accepts any combination of arguments. *args for variable positional: func(1, 2, 3). **kwargs for variable keyword: func(name='Alice', age=25). Both are commonly used in decorators to forward arguments to the wrapped function.
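
A tiny sketch showing what each parameter actually receives:

```python
def describe(*args, **kwargs):
    # args is a tuple of positional arguments,
    # kwargs is a dict of keyword arguments
    return args, kwargs

pos, kw = describe(1, 2, 3, name="Alice", age=25)
```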

Python · Beginner

What is the difference between == and 'is' in Python?

Answer

== checks value equality — do two objects have the same value? 'is' checks identity — are two names bound to the exact same object in memory? 5 == 5.0 is True (same value), but 5 is 5.0 is False (different objects). A common bug: using 'is' to compare strings or integers — it can appear to work because CPython caches small integers (-5 to 256), then fail for values outside the cache. Always use == for value comparison.
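
A sketch of the trap (the identity result for large ints is a CPython implementation detail, not a language guarantee):

```python
a = 5
b = 5.0
value_equal = (a == b)      # True: same numeric value
same_object = (a is b)      # False: an int object and a float object

x = 1000
y = int("1000")             # built at runtime, outside the small-int cache
# x == y is True, but in CPython x and y are distinct objects
```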

Python · Beginner

What is a virtual environment and why is it mandatory?

Answer

A virtual environment is an isolated Python installation with its own site-packages directory. It prevents version conflicts between projects — Project A (Django 4.2) and Project B (Django 3.2) can coexist because each project has its own virtualenv. Created with python -m venv venv, activated with source venv/bin/activate. Without it, all projects share one Python installation and dependency versions inevitably conflict.

Python · Beginner

What is the GIL in Python and what does it affect?

Answer

The Global Interpreter Lock (GIL) is a mutex in CPython that allows only one thread to execute Python bytecode at a time. This means Python threads don't provide true parallelism for CPU-bound tasks — even on multi-core processors. The GIL does NOT affect I/O-bound operations (HTTP calls, file reads). For CPU parallelism, use multiprocessing. NumPy and PyTorch release the GIL during heavy C-level operations.

Python · Beginner

How does Python's garbage collection work?

Answer

Python uses reference counting as the primary memory management strategy — every object has a count of references to it; when it reaches zero, the memory is freed immediately. A cyclic garbage collector handles reference cycles (object A references B, B references A — both have non-zero counts but are unreachable). The gc module exposes the cyclic collector. Memory leaks in Python usually mean keeping references alive unintentionally.

Python · Intermediate

Explain Python's data model and dunder methods.

Answer

Python's data model defines how objects behave with built-in operations via dunder (double underscore) methods. __len__ makes len(obj) work. __getitem__ enables obj[key]. __iter__ and __next__ enable for loops. __eq__ and __hash__ define equality and dict key behavior. __enter__/__exit__ enable the with statement. Implementing these makes custom classes integrate seamlessly with Python's syntax and built-in functions.
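
A minimal sketch of a class that plugs into built-ins via dunder methods (the class is illustrative):

```python
class Deck:
    """Custom container integrating with len(), indexing, and iteration."""
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):              # enables len(deck)
        return len(self._cards)

    def __getitem__(self, i):       # enables deck[i], for loops, and 'in'
        return self._cards[i]

deck = Deck(["A", "K", "Q"])
```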

Python · Intermediate

What is a generator and when should you use one instead of a list?

Answer

A generator is a function that uses yield instead of return — it returns values lazily, one at a time, rather than computing all values upfront. Use generators for large datasets that don't fit in memory, infinite sequences, and data pipelines. A list of 10 million integers takes ~400MB; a generator producing them uses ~100 bytes. Generator expressions: (x**2 for x in range(10**6)) vs list comprehension which loads all into RAM.
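
The size gap is easy to verify with sys.getsizeof (counts are illustrative; getsizeof measures only the container/generator object itself):

```python
import sys

def squares_list(n):
    return [x * x for x in range(n)]    # all values materialized up front

def squares_gen(n):
    for x in range(n):
        yield x * x                     # lazily produces one value at a time

eager = squares_list(10_000)
lazy = squares_gen(10_000)

list_size = sys.getsizeof(eager)        # grows with n
gen_size = sys.getsizeof(lazy)          # small and constant
```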

Python · Intermediate

What is the difference between @staticmethod, @classmethod, and an instance method?

Answer

Instance method: first parameter is self (the instance) — accesses and modifies instance state. @classmethod: first parameter is cls (the class) — accesses class-level attributes, used for alternative constructors (Product.from_dict(data)). @staticmethod: no self or cls — a regular function logically grouped in the class, doesn't access instance or class state. Used for utility functions related to the class.
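
A compact sketch with all three kinds in one class (the class and method names are illustrative):

```python
class Product:
    def __init__(self, name, price):
        self.name = name
        self.price = price

    def discounted(self, pct):           # instance method: reads self.price
        return self.price * (1 - pct)

    @classmethod
    def from_dict(cls, data):            # alternative constructor: uses cls
        return cls(data["name"], data["price"])

    @staticmethod
    def cents(price):                    # utility: no self, no cls
        return round(price * 100)

p = Product.from_dict({"name": "pen", "price": 2.5})
```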

Python · Intermediate

How does asyncio work? What is the event loop?

Answer

asyncio implements cooperative multitasking with a single-threaded event loop. When a coroutine hits await, it suspends and yields control back to the event loop. The loop picks the next ready coroutine to run. When the awaited I/O completes (HTTP response, DB query result), the event loop schedules the suspended coroutine to resume. This enables thousands of concurrent I/O operations without threads — no GIL issues, no thread synchronization overhead.
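
A minimal sketch of two coroutines interleaving on one event loop (asyncio.sleep stands in for real I/O):

```python
import asyncio

async def fetch(tag, delay):
    # await suspends this coroutine; the event loop runs others meanwhile
    await asyncio.sleep(delay)
    return tag

async def main():
    # both coroutines run concurrently on a single thread;
    # gather returns results in argument order
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

results = asyncio.run(main())
```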

Python · Intermediate

What is data leakage in machine learning and how do you prevent it?

Answer

Data leakage occurs when information from the test set influences model training, producing optimistically inflated evaluation metrics that don't generalize. Common causes: fitting scalers on entire dataset before splitting, target encoding with full dataset statistics, feature selection using test labels. Prevention: use scikit-learn Pipeline which fits preprocessing only on training data, always split before any transformation, be careful with time-series data (respect temporal ordering).
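
A library-free sketch of the scaler mistake (the numbers are illustrative): centering with a mean computed on the full dataset lets a test-set outlier shift the training features.

```python
from statistics import mean

data = [1.0, 2.0, 3.0, 100.0]   # the outlier lands in the test split
train, test = data[:3], data[3:]

# Leaky: statistics computed on *all* data, including test rows
leaky_mean = mean(data)          # dragged up by the test outlier
# Correct: fit the transform on training data only
train_mean = mean(train)

leaky_train = [x - leaky_mean for x in train]
clean_train = [x - train_mean for x in train]
```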

Python · Intermediate

Explain the difference between SQL and NoSQL databases and when to use each with Python.

Answer

SQL (PostgreSQL, MySQL): structured relational data, ACID transactions, complex queries with JOINs. Use for user accounts, orders, financial records — anything with relationships and consistency requirements. Python: SQLAlchemy, psycopg2, asyncpg. NoSQL (MongoDB): flexible schema, horizontal scaling, document storage. Use for product catalogs, user activity logs, chat messages — unstructured or semi-structured data. Python: PyMongo, Motor (async).

Python · Intermediate

How do you test FastAPI endpoints with pytest?

Answer

Use httpx.AsyncClient with an ASGITransport wrapping the FastAPI app (newer httpx versions removed the app= shortcut). Create a test database with conftest.py fixtures — override the get_db dependency to point at it. Use pytest-asyncio for async test functions. Test: status codes, response shapes, database state after mutations, authentication flows, edge cases. Example: async with AsyncClient(transport=ASGITransport(app=app), base_url='http://test') as client: response = await client.post('/users', json={...}); assert response.status_code == 201.

Python · Intermediate

What are Python metaclasses?

Answer

A metaclass is a class whose instances are classes themselves. When Python processes a class definition, it calls the metaclass to create the class object. By default, the metaclass is type. Custom metaclasses intercept class creation: you can modify class attributes, enforce naming conventions, register classes automatically, or inject methods. Django's ORM uses metaclasses to turn class attributes into database columns. Rarely needed — if you're asking 'should I use a metaclass?', the answer is usually no.
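
A sketch of the auto-registration use case mentioned above (the registry and class names are illustrative):

```python
registry = {}

class AutoRegister(type):
    """Metaclass: records every class created with it."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        registry[name] = cls        # side effect at class-definition time
        return cls

class Base(metaclass=AutoRegister):
    pass

class JsonHandler(Base):            # registered automatically on definition
    pass
```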

Python · Intermediate

What is Pydantic and why is it used in FastAPI?

Answer

Pydantic is a data validation and serialization library using Python type hints. It validates that input data matches declared types (including nested models, custom validators), converts types automatically (string '42' → int 42), and serializes Python objects to JSON. FastAPI uses Pydantic models for request body parsing/validation, response serialization, and auto-generating OpenAPI documentation. Together, they eliminate manual request parsing, manual validation, and API documentation writing.

Python · Intermediate

How do you handle database migrations in a Python backend project?

Answer

Alembic is the standard database migration tool for SQLAlchemy projects. It autogenerates migration scripts by comparing your current SQLAlchemy models to the current database schema (alembic revision --autogenerate -m 'add user table'). Migrations are Python files with upgrade() and downgrade() functions. Apply with alembic upgrade head. Django has its own built-in migrations system (python manage.py makemigrations + migrate). Always commit migration files to git.

Python · Intermediate

Explain the difference between multiprocessing, threading, and asyncio in Python.

Answer

asyncio: single thread, event loop, cooperative multitasking via coroutines — best for I/O-bound concurrency (HTTP, DB, file I/O). threading: multiple OS threads in one process, GIL limits CPU parallelism — use for I/O-bound work with blocking libraries. multiprocessing: multiple separate Python processes, true CPU parallelism — use for CPU-bound work (ML inference, image processing, data computation). asyncio scales to millions of connections; multiprocessing scales to CPU core count.

Python · Advanced

How would you architect a Python-based ML serving system for 10,000 requests per second?

Answer

FastAPI with async endpoints and async ML inference where possible. Model loaded once at startup (lifespan event), not per request. PyTorch with TorchScript or ONNX for CPU optimization; TensorRT for GPU. Redis cache for repeated predictions. Horizontal scaling with multiple FastAPI replicas behind NGINX/Traefik load balancer. Request batching for GPU inference. Celery for heavy background processing. Separate feature computation from inference. Monitor with Prometheus + Grafana.

Python · Advanced

What is RAG (Retrieval-Augmented Generation) and how do you build a production RAG pipeline?

Answer

RAG combines vector search (retrieval) with LLM generation. Pipeline: document chunking with overlap, embedding with a sentence transformer model, vector storage in ChromaDB or Pinecone, semantic similarity search at query time (MMR for diversity), context assembly, LLM generation with retrieved context. Production concerns: chunking strategy for your document type, embedding model selection (quality vs cost), query expansion, reranking retrieved chunks, evaluation with RAGAS or TruLens, caching embeddings.

Python · Advanced

How do you prevent and diagnose memory leaks in a long-running Python process?

Answer

Profile with tracemalloc (built-in) or memory_profiler. Common causes: references kept alive in module-level collections (caches, registries), closures capturing large objects, cycles not collected, unclosed file/network connections, appending to lists indefinitely. Fix: use weakref for caches, explicitly del large objects, use generators instead of lists for large data, ensure async resources are properly closed (async with), use object pools. Monitor process RSS memory in production with Prometheus.
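
A minimal tracemalloc sketch, simulating the "module-level list that only grows" leak (sizes are illustrative):

```python
import tracemalloc

tracemalloc.start()

leaky_cache = []                     # grows forever: a classic accidental leak
for i in range(50_000):
    leaky_cache.append("payload-%d" % i)

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")  # allocations grouped by source line
total_kib = sum(stat.size for stat in top) / 1024
tracemalloc.stop()
```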

Python · Advanced

Explain Python's descriptor protocol.

Answer

A descriptor is an object that defines __get__, __set__, or __delete__ — controlling attribute access on the class that contains it. When you access obj.attr, Python calls type(obj).__dict__['attr'].__get__(obj, type(obj)) if the attribute implements the descriptor protocol. This is how @property works, how functions become bound methods, how Django ORM fields work, and how SQLAlchemy's Column() creates database-mapped attributes. Custom descriptors enable lazy loading, validation on assignment, and computed properties.
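
A small custom descriptor implementing the validate-on-assignment use case (the names are illustrative):

```python
class Positive:
    """Data descriptor: validates every assignment to the attribute."""
    def __set_name__(self, owner, name):
        self.name = "_" + name
    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)
    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("must be positive")
        setattr(obj, self.name, value)

class Order:
    quantity = Positive()       # descriptor lives on the class

order = Order()
order.quantity = 3              # routed through Positive.__set__
try:
    order.quantity = -1
except ValueError:
    rejected = True
```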

Python · Advanced

How do you profile and optimize a slow Python function?

Answer

Profile first — never optimize blind. cProfile: python -m cProfile -s cumulative script.py (call tree). line_profiler: @profile decorator on the suspect function (pip install line-profiler). py-spy: sampling profiler for production (no code modification). Common fixes: replace Python loops with NumPy vectorized operations, use __slots__ to reduce memory in classes with many instances, use lru_cache for expensive repeated computations, move hot paths to Cython or Numba, use bisect for sorted searches instead of linear scan.
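
The lru_cache fix is easy to see by counting invocations; a memoized recursive Fibonacci computes each n exactly once (the counter is for illustration):

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def fib(n):
    global call_count
    call_count += 1                 # counts only uncached invocations
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(30)
calls_with_cache = call_count       # 31: one call per n in 0..30
```

Without the cache, the same computation makes over a million recursive calls.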

Python · Advanced

What are Python's concurrency primitives and how do you coordinate async tasks?

Answer

asyncio.gather: run multiple coroutines concurrently, collect all results. asyncio.TaskGroup (Python 3.11+): structured concurrency — all tasks cancelled if any fails. asyncio.Semaphore: limit concurrent operations (rate limiting). asyncio.Event: signal between coroutines. asyncio.Queue: producer-consumer coordination. asyncio.Lock: mutual exclusion for shared state. asyncio.timeout (Python 3.11+): cancel operations that take too long. Use TaskGroup over gather for better error handling and structured lifecycle.
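
A minimal Semaphore sketch — six workers, at most two inside the critical section at once (the bookkeeping lists are illustrative):

```python
import asyncio

async def worker(sem, active, peaks):
    async with sem:                 # at most 2 workers hold the semaphore
        active[0] += 1
        peaks.append(active[0])     # record current concurrency
        await asyncio.sleep(0.01)
        active[0] -= 1

async def main():
    sem = asyncio.Semaphore(2)
    active, peaks = [0], []
    await asyncio.gather(*(worker(sem, active, peaks) for _ in range(6)))
    return max(peaks)

peak_concurrency = asyncio.run(main())
```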

Python · Advanced

How do you design a Python microservices architecture?

Answer

Each service is a FastAPI application in its own Docker container. Communication: synchronous inter-service calls via httpx (HTTP), asynchronous via Kafka or RabbitMQ (Python producers/consumers with aiokafka). Service discovery: Docker Compose for dev, Kubernetes for production. Shared code: published as a private PyPI package (common types, utilities). Distributed tracing: OpenTelemetry Python SDK → Jaeger. Each service has its own database. Authentication: JWT validated at API gateway, not per-service.

Python · Advanced

What is Python's import system and how does sys.path work?

Answer

When you import a module, Python first checks built-in modules, then searches sys.path in order: the script's directory, PYTHONPATH environment variable entries, then site-packages (pip-installed packages). import mymodule: Python looks for mymodule.py or mymodule/__init__.py. Circular imports surface as partially-initialized modules — avoid them by moving imports inside functions or restructuring code. Custom importers can be registered via sys.meta_path hooks (how coverage.py and import-time patching work).
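
Two pieces of the machinery are easy to poke at directly: sys.path is an ordinary list, and loaded modules are cached in sys.modules:

```python
import importlib
import sys

# sys.path is a plain list of directories, searched left to right
search_dirs = list(sys.path[:3])

# importlib.import_module resolves a module by name at runtime,
# using the same search as the import statement
json_mod = importlib.import_module("json")

# imports are cached: a second import returns the same module object
cached = "json" in sys.modules
```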

Python · Advanced

How do you implement a distributed task queue with Celery in Python?

Answer

Celery uses a broker (Redis or RabbitMQ) for task queuing and an optional result backend (Redis, PostgreSQL) for storing results. Define tasks with @celery_app.task decorator. Send tasks asynchronously with task.delay() or task.apply_async(). Workers run in separate processes: celery -A myapp worker --loglevel=info. Scheduled tasks via Celery Beat: celery -A myapp beat. Key patterns: task retries with autoretry_for, task routing to specialized queues, idempotency for reliability, chord/group/chain for task workflows.

Python · Advanced

How do you evaluate and improve a machine learning model in production?

Answer

Monitor data drift (feature distribution changes) with tools like Evidently or WhyLogs. Track prediction drift (output distribution changes). Log all predictions and true labels for offline evaluation. A/B test new models against the baseline with traffic splitting. Set up automated retraining triggers when drift exceeds threshold. Use shadow mode deployment — run new model in parallel without affecting production. Key metrics beyond accuracy: latency percentiles (p95, p99), memory usage, throughput, and business metrics (not just ML metrics).

Python · Advanced

What is Python's __slots__ and when should you use it?

Answer

__slots__ = ['x', 'y'] in a class definition prevents Python from creating a per-instance __dict__, storing attributes in a fixed-size array instead. Benefits: 30–50% memory reduction for classes with many instances (Point, Vertex, trade record classes), slightly faster attribute access. Costs: cannot add new attributes dynamically, no default values without __init__, inheritance complications. Use when creating millions of small objects — NumPy alternatives are usually better for numerical data.
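
A quick sketch of both effects — no per-instance __dict__, and no dynamic attributes:

```python
class PointSlots:
    __slots__ = ("x", "y")          # attributes stored in fixed slots
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointDict:                    # ordinary class for comparison
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointSlots(1, 2)
try:
    p.z = 3                          # slots forbid new attributes
except AttributeError:
    blocked = True

has_dict = hasattr(PointDict(1, 2), "__dict__")     # True
slots_has_dict = hasattr(p, "__dict__")             # False
```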
